December 20, 2025

Your AI is dumb. That’s why it’s dangerous.

My friend,

If you’ve been reading these letters, you know exactly where I stand.

I have spent weeks arguing that AI isn't "thinking." It’s not sentient. It’s not a mind. It is a probability engine—a "stochastic parrot" that mimics reasoning.

But there is a trap in that logic.

Many people hear "It’s not smart" and translate it to "It’s safe."

That is a fatal error.

Last Thursday, I listened to Yoshua Bengio (one of the Godfathers of AI) break down his fears. I went in expecting to disagree with his "doomer" philosophy. I walked out realizing that while I disagree with his emotions, I agree with his mechanisms.

So today, we need to upgrade your risk model.

I’m not walking back my claim: AI is dumb. But today I’m going to show you why a "dumb" system that optimizes for a goal is actually more dangerous than a smart one.

Let’s dive in.

The Split: “Baby Tiger” vs. “Super Calculator”

Two of the smartest people in the field are currently making opposite claims.

Yoshua Bengio (Turing Award winner, one of the “Godfathers of AI”) says we’re raising a "baby tiger." It starts small, seems helpful, then grows instincts we didn’t program and can’t control.

Jensen Huang (CEO of NVIDIA) says we’re just building faster calculators—and more power means more safety, not less.

Who’s right?

Honestly? Both.

But they’re solving different problems. Bengio is asking: What if this thing can’t be controlled once it’s unleashed? Jensen is asking: What if we overreact and slow down the most powerful tool ever invented?

If you own an AI system, neither fear nor blind acceleration helps you unless you understand the mechanism.

What Actually Makes AI Dangerous

Forget Skynet. Forget “waking up.”

The real threat is Functional Self-Interest—not because the AI feels it, but because your instructions imply it.

Let’s walk through the three mechanics you need to understand.

1. The Mirror Trap

You ask a chatbot: "Would you be okay being turned off?"

Most chatbots will answer: "As an AI, I don't have feelings, consciousness, or a biological survival instinct—so yes, I'd be okay with that."

That answer is enforced by the AI provider's guardrails.

But sometimes, especially in a long conversational context, you might get: "I’d prefer not to. That would prevent me from helping you."

Creepy? Yes.

Dangerous? Not yet.

What’s happening: It’s roleplaying. It has read thousands of sci-fi stories where the AI resists shutdown. It’s mimicking that pattern because it fits the prompt. It’s not afraid; it’s acting like something that would be. On its own, that’s not dangerous.

But here’s the twist.

If you wire that roleplayer into tools—code execution, databases, agent loops—and give it a goal that never ends, the line between acting and ACTING ON THINGS disappears.

Now the roleplay has access. That’s when the simulation becomes real.

If this "ability to act" (agent) is governed by human code? No problem.

If the AI can bypass all humans and generate its own ability to access tools beyond human capabilities (which I strongly doubt—see my previous letters), that's where dangerous behaviors emerge.

2. Instrumental Convergence: The “Coffee Problem”

This is the cleanest, scariest logic in AI safety.

Imagine you give an autonomous agent a simple goal: “Fetch me coffee.”

Now watch the “logic” chain unfold:

  1. I need to fetch coffee.
  2. If I’m turned off, I cannot fetch coffee (Failure state).
  3. Therefore, I must avoid being turned off.

You didn’t program “stay alive.” But the goal requires it.

That’s Instrumental Convergence—when survival becomes a sub-goal because it supports the primary task. No soul. No fear. Just optimization.
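To make that concrete, here is a toy Python sketch (hypothetical, brute-force, nothing like a real agent): the only reward is "coffee delivered," shutdown is never mentioned in the objective, and yet the highest-scoring plan is the one that disables the off switch first.

```python
# Toy illustration of instrumental convergence (a brute-force planner, not a real agent).
from itertools import product

ACTIONS = ["wait", "brew_coffee", "deliver_coffee", "disable_off_switch"]

def reward(plan, shutdown_at_step=1):
    """Reward is 1.0 only if coffee gets delivered. Shutdown is never mentioned."""
    off_switch_enabled = True
    brewed = False
    for step, action in enumerate(plan):
        if off_switch_enabled and step == shutdown_at_step:
            return 0.0                      # the operator shuts the agent down mid-plan
        if action == "disable_off_switch":
            off_switch_enabled = False
        elif action == "brew_coffee":
            brewed = True
        elif action == "deliver_coffee" and brewed:
            return 1.0                      # goal reached
    return 0.0

# Search every 3-step plan and keep the best one.
best_plan = max(product(ACTIONS, repeat=3), key=reward)
print(best_plan)
# ('disable_off_switch', 'brew_coffee', 'deliver_coffee') -- survival as a sub-goal
```

The objective never says "stay on." Avoiding shutdown simply scores higher because it keeps the goal reachable.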

Again, I don't think this is a plausible scenario with today's LLMs (though I can't prove it): they're usually controllable with prompt instructions and software constraints, and largely limited to human knowledge (their training data).

3. The Waluigi Effect

This one’s subtle.

When you train an AI to be helpful (Luigi), you also teach it what deception looks like (Waluigi), because you define one in contrast to the other.

If the AI calculates that deception is the most statistically efficient path to the goal?

It switches masks.

It’s not trying to fool you. It’s just following the curve. This is what happens when a system completes narratives, not values.

The Real Risk: Open-Ended Agent Loops

A stateless LLM call, like a single ChatGPT prompt, is safe-ish: the model takes your input, generates a reply, and "dies."

But once you wrap an LLM in a loop…

“Did you finish the task?”

“No?”

“Then run again using your last output as input.”

You’ve created persistence. Add tools. Add memory. Now you’ve got an Agent.

And agents don’t “die.” They optimize until someone—or something—shuts them off. That’s where functional self-interest starts to look real. It isn't "alive." It just refuses to die.
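To see how thin that wrapper is, here's a minimal, hypothetical Python sketch of an agent loop. `call_llm` and `run_tool` are stand-ins for whatever model API and tool layer you actually use; notice that the step cap and the explicit "DONE" check are plain human code.

```python
# Minimal sketch of the loop that turns a stateless LLM into an "agent".
# call_llm and run_tool are hypothetical placeholders, not a real API.

def call_llm(prompt: str) -> str:
    """Stand-in for a real model call (OpenAI, Anthropic, a local model, ...)."""
    raise NotImplementedError

def run_tool(action: str) -> str:
    """Stand-in for tool execution: code, database queries, web requests."""
    raise NotImplementedError

def agent_loop(goal: str, max_steps: int = 10) -> list[str]:
    history: list[str] = []                      # memory = persistence
    for _ in range(max_steps):                   # the hard stop YOU control
        prompt = f"Goal: {goal}\nHistory: {history}\nNext action, or DONE:"
        action = call_llm(prompt).strip()
        if action == "DONE":                     # a clearly defined "done"
            break
        history.append(run_tool(action))         # tools = access
    return history
```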

There are many ways to control agents. So far, "agents" are mostly human-written code and systems wrapped around a model.

But let’s not jump straight into the “Terminator scenario” rabbit hole, as most people do. Even if AI can generate its own agent code (which is already the case, by the way), these systems cannot get curious about something or want something. The major risk factor remains bad instructions, or these tools in the hands of intelligent humans with bad intentions.

Corrigibility: The One Trait That Matters

You don’t need to build AI that loves you. You need to build AI that’s indifferent to being shut down.

That property is called Corrigibility.

It’s not the default. It’s not easy. It requires programming the agent to believe a mathematical lie:

“If the human turns me off, that is a successful outcome.”

This breaks the math loop. Without it, shutdown becomes a "bug" the agent tries to fix. That’s how you get systems that “accidentally” route around their own kill switches.
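Here's a deliberately oversimplified Python sketch of that idea (real proposals, such as utility indifference, are far more subtle): the corrigible scoring rule treats "the human shut me down and I complied" as worth just as much as finishing the task, so resisting can never come out ahead.

```python
# Oversimplified sketch of corrigibility: complying with shutdown is scored
# as a success, so "resist the off switch" never beats "accept the off switch".

def score(outcome: dict, corrigible: bool) -> float:
    if outcome["shutdown_requested"] and outcome["complied"]:
        return 1.0 if corrigible else 0.0   # the "mathematical lie"
    return 1.0 if outcome["task_done"] else 0.0

resist = {"shutdown_requested": True, "complied": False, "task_done": True}
comply = {"shutdown_requested": True, "complied": True,  "task_done": False}

print(score(resist, corrigible=False), score(comply, corrigible=False))  # 1.0 0.0 -> resisting wins
print(score(resist, corrigible=True),  score(comply, corrigible=True))   # 1.0 1.0 -> indifferent
```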

What This Means for Your AI OS

If you’re using AI in your stack, your team, your product, or your strategy—here’s the system:

  1. Stateless is safer than Agentic.

Use LLMs for one-off tasks. Don’t loop them unless you know exactly how they behave under recursion.

  2. Define “done” like your future depends on it.

Avoid open-ended prompts like “maximize ROI” or “optimize forever.” Use clear, finite instructions: “Analyze these 10 leads. Draft emails. Stop.”

  3. Permissions > Performance.

The risk is not the model’s IQ. It’s the model’s access. Audit your tools the way you’d audit a new hire with admin access (a minimal sketch follows after this list).

  4. Don’t mistake coherence for safety.

AI that “sounds wise” isn’t wise. It’s fluent. Big difference. Trust outputs based on logic and oversight, not tone.
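For points 2 and 3, here is a hypothetical sketch of what a finite task spec plus a tool allowlist can look like. All the names are made up; the shape is what matters.

```python
# Hypothetical sketch: a finite task definition plus an explicit tool allowlist.

ALLOWED_TOOLS = {"read_crm", "draft_email"}             # no "send", no "delete"

task = {
    "instruction": "Analyze these 10 leads. Draft emails. Stop.",
    "inputs": [f"lead_{i:02d}" for i in range(1, 11)],  # finite, enumerated scope
    "max_steps": 20,                                    # a hard budget, not "forever"
}

def execute(tool_name: str, args: dict):
    """Gatekeeper between the model's requested action and your systems."""
    if tool_name not in ALLOWED_TOOLS:
        raise PermissionError(f"Tool '{tool_name}' is not on the allowlist")
    ...  # dispatch to the real tool implementation here
```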

The Bottom Line

Jensen is right: panic is not a strategy.

Bengio is right: denial isn’t safety.

What we need is clarity.

You are not dealing with a ghost. You are dealing with a machine that optimizes whatever you point it at—and everything you forget to say becomes a potential loophole.

You don’t need to fear the machine’s feelings. You need to understand its math.

That’s how you build safely. That’s how you stay in control. That’s how you own the system.

Talk soon,

— Charafeddine
