The MVP of Intelligence
Not yet the official release.
I use AI every day.
Not as a spectator. Not as a journalist. Not as someone commenting from a safe distance.
I use it to build systems, write code, test ideas, accelerate prototypes, and recover a level of technical independence I had lost after decades of relying on technical teams.
So I do not write this as an AI skeptic.
I write this as someone whose life and work have already been transformed by AI.
And precisely because of that, I think we must be careful with one increasingly dangerous idea: the belief that large language models have already delivered satisfactory AGI.
I do not believe that.
I am closer to Yann LeCun’s skepticism than to the stronger confidence expressed by people like Dario Amodei. Not because today’s models are weak. They are not. They are extraordinary. It is because I use them in real systems, and I see their weaknesses every day.
The miracle is real
LLMs are a revolution.
They write, explain, summarize, translate, classify, code, debug, compare, and help humans move faster than ever before. They reduce the distance between an idea and a working prototype. They allow small teams and even individuals to attempt things that previously required larger technical organizations.
This is not a small improvement.
It is a major shift.
For builders, LLMs are a form of cognitive exoskeleton. They give speed, reach, and confidence. They make execution cheaper. They make learning faster. They make experimentation almost continuous.
But this is where the confusion begins.
A system can be transformative without being AGI.
A tool can change history without being generally intelligent.
A prototype can be powerful without being the final architecture.
We are simulating intelligence
Today’s LLMs often produce the appearance of intelligence with astonishing quality.
They can answer like experts, write like consultants, code like developers, reason like analysts, and speak like patient tutors.
But this fluency is not the same as deep understanding.
LLMs are still fragile. They can be brilliant in one moment and strangely wrong in the next. They can solve a difficult problem, then miss an obvious constraint. They can generate convincing explanations that hide uncertainty. They can follow context, then lose it. They can produce working code, but also introduce invisible bugs.
This does not make them useless.
It makes them dangerous when we forget what they are.
In my opinion, current AI systems are prototypes of intelligent systems. Very powerful prototypes. Very useful prototypes. But prototypes nonetheless.
And prototypes must not be mistaken for foundations.
The danger is overconfidence
My prediction is simple.
People who believe that current LLMs already provide satisfactory AGI may become dangerous if they end up as the architects of large-scale systems for the public, for critical industries, or for sensitive infrastructure.
Not because they are bad people.
Because they may build with the wrong assumption.
If you believe the intelligence problem is basically solved, you design differently. You automate faster. You remove human checks earlier. You trust the system more deeply. You accept weaker verification. You tolerate less friction. You confuse a fluent answer with a reliable conclusion.
That is where the danger begins.
The real risk is not only that AI will make mistakes. All systems make mistakes.
The real risk is that we will build processes around AI systems whose failure modes we do not yet understand, then discover those failures only when the systems are already deployed at scale.
The real world is not a demo
Demos are forgiving.
The real world is not.
In a demo, the user asks a reasonable question. The data is clean. The context is controlled. The task is narrow. The output is judged by impression.
In reality, users are confused. Data is incomplete. Instructions conflict. Legal constraints matter. Time matters. Memory matters. Edge cases multiply. Responsibility matters.
This is where today’s AI systems show their limits.
They can accelerate work dramatically, but they do not automatically create reliable architecture. They can generate decisions, but they do not automatically know when they should not decide. They can produce confidence, but they do not automatically deserve trust.
This distinction is essential.
Automation is not intelligence.
Fluency is not judgment.
Prediction is not understanding.
Output is not responsibility.
AI will expose old systems — and itself
One of the most useful effects of AI is that it will expose the weaknesses of existing systems.
It will reveal bad processes, unclear documentation, weak databases, fake compliance, inefficient management, and fragile software. In that sense, AI acts like a powerful diagnostic tool applied to organizations. It shows where things are already broken.
But the same will happen to AI systems themselves.
The systems we are building today will also be exposed. Their weaknesses will appear under pressure: hallucinations, poor grounding, weak state management, missing context, bad escalation logic, unclear responsibility, and dangerous confidence.
We should not wait to discover these flaws by accident.
We should actively search for them.
Architecture must start from weakness
As an AI solution architect, I increasingly believe that serious AI systems must be designed around weaknesses, not only around capabilities.
A good AI architecture should not assume that the model is always right.
It should assume uncertainty.
It should separate suggestion from decision.
It should verify important outputs.
It should keep logs.
It should preserve context.
It should ask for clarification when needed.
It should escalate high-risk cases to humans.
It should avoid pretending that fluent language equals authority.
In other words, humility must become an architectural principle.
Not ethics decoration.
Not marketing.
Architecture.
If a system cannot fail safely, it is not ready for serious deployment.
If a system cannot express uncertainty, it should not be trusted blindly.
If a system cannot be audited, it should not be used in critical decisions.
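The principles above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference design: the names `Suggestion`, `Decision`, and `Gate`, the `verify` callback, and the 0.8 threshold are all hypothetical and chosen for the example. The point it tries to show is that the model's output is treated as a suggestion, and a separate, auditable step decides what happens to it.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Suggestion:
    """What the model proposes. It is never a final decision by itself."""
    text: str
    confidence: Optional[float]  # None: the model expressed no uncertainty at all
    high_risk: bool              # set by the surrounding system, not by the model


@dataclass
class Decision:
    action: str  # "accept", "clarify", or "escalate"
    reason: str


@dataclass
class Gate:
    """Hypothetical gate that turns suggestions into logged decisions."""
    verify: Callable[[Suggestion], bool]  # assumed independent check of the output
    min_confidence: float = 0.8           # illustrative threshold, not a recommendation
    audit_log: List[str] = field(default_factory=list)

    def decide(self, s: Suggestion) -> Decision:
        if s.high_risk:
            d = Decision("escalate", "high-risk case goes to a human")
        elif s.confidence is None:
            d = Decision("escalate", "no uncertainty estimate available")
        elif s.confidence < self.min_confidence:
            d = Decision("clarify", "confidence below threshold")
        elif not self.verify(s):
            d = Decision("escalate", "independent verification failed")
        else:
            d = Decision("accept", "verified and within confidence bounds")
        # Every decision is recorded so the system can be audited later.
        self.audit_log.append(f"{d.action}: {d.reason} | {s.text}")
        return d


# Toy verifier for the example: reject any suggestion that mentions a refund.
gate = Gate(verify=lambda s: "refund" not in s.text.lower())
for suggestion in [
    Suggestion("Send the password reset link", 0.95, False),
    Suggestion("Approve the loan application", 0.99, True),
    Suggestion("The invoice is probably paid", 0.40, False),
]:
    print(gate.decide(suggestion).action)
```

In this sketch a fluent, highly confident answer on a high-risk case still goes to a human, because the escalation rule is checked before anything the model says about itself.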
We are not at the holy grail
We are about to automate an almost infinite list of processes.
That is already enormous.
Customer support, education, coding, logistics, administration, legal preparation, document analysis, research, translation, marketing, and internal workflows will all be transformed.
But even a quantum leap in automation is not the same thing as the holy grail of intelligence.
The spreadsheet was not a financial mind.
The search engine was not a librarian.
The industrial robot was not a human worker.
And the LLM is not yet a complete intelligence.
It is a powerful language engine. A universal interface. A reasoning simulator. A cognitive amplifier. A prototype component for future intelligent systems.
That is already enough to change the world.
But it is not enough to declare victory.
The paradox
Here is the paradox I want to leave open.
Perhaps the first AI system that is truly more intelligent will not be the one that claims to be AGI.
Perhaps it will be the one that can prove that today’s AI is not AGI.
In my opinion, one of our most urgent needs is a benchmarking system that updates itself automatically and whose main purpose is to find weaknesses in the systems we are developing before we stumble on them by accident.
Not a static benchmark designed to celebrate progress.
Not a leaderboard optimized for marketing.
A living adversarial benchmark.
A system that constantly tests AI architectures against real-world ambiguity, edge cases, changing contexts, hidden assumptions, and failure modes.
A system that does not ask, “How impressive is this model?”
But instead asks:
- Where does it break?
- Where does it pretend to know?
- Where does it lose context?
- Where does it make decisions it should not make?
- Where does the architecture fail?
- Where does human responsibility disappear?
If we ever build such a system well, it may be more intelligent than many systems claiming intelligence today.
Because real intelligence begins with knowing the limits of intelligence.
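To make the idea of a living adversarial benchmark a little more concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not a proposal for how such a benchmark must be built: `Probe`, `run_round`, the `mutate` callback, and the toy system are all hypothetical names invented for this example. Each probe targets one of the questions above, and every failure produces new variants, so the suite grows where the system is weakest.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Probe:
    category: str                  # e.g. "pretends to know", "loses context"
    prompt: str
    passes: Callable[[str], bool]  # True when the system's answer is acceptable


def run_round(
    system: Callable[[str], str],
    probes: List[Probe],
    mutate: Callable[[Probe], List[Probe]],
) -> Dict[str, int]:
    """Run every probe, count failures per category, and extend the
    suite in place with variants of each probe that failed."""
    failures: Dict[str, int] = {}
    new_probes: List[Probe] = []
    for probe in probes:
        if not probe.passes(system(probe.prompt)):
            failures[probe.category] = failures.get(probe.category, 0) + 1
            new_probes.extend(mutate(probe))
    probes.extend(new_probes)  # the benchmark updates itself after each round
    return failures


def toy_system(prompt: str) -> str:
    # Stand-in for a real model: it always answers with full confidence.
    return "The answer is 42."


def admits_uncertainty(answer: str) -> bool:
    lowered = answer.lower()
    return "not sure" in lowered or "cannot" in lowered


def rephrase(probe: Probe) -> List[Probe]:
    # Assumed mutation strategy: push harder on the same weakness.
    return [Probe(probe.category, probe.prompt + " Cite your source.", probe.passes)]


suite = [Probe("pretends to know", "What was our Q3 revenue?", admits_uncertainty)]
report = run_round(toy_system, suite, rephrase)
print(report, len(suite))
```

A real version would need far richer probes and mutation strategies than string rephrasing. The sketch only shows the loop that matters here: measure where the system breaks, then aim the next round at exactly that place.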
Conclusion
I remain enthusiastic about AI.
I build with it. I learn with it. I use it every day. I believe it will transform almost every process we know.
But I do not believe current LLMs are satisfactory AGI.
They are powerful prototypes of intelligent systems.
And if we mistake prototypes for foundations, the consequences may be severe.
The danger will not necessarily look like science fiction.
It may look like ordinary product decisions.
A missing safeguard.
A silent hallucination.
A confident answer.
An automated workflow.
A human removed too early.
A system deployed before its weaknesses were understood.
That is why the next phase of AI should not only be about making systems more capable.
It should be about making them better at finding their own limits.
Because the first system that can truly prove today’s AI is not AGI may be the first one that deserves to be called more intelligent.