Ignite Bold Ideas, Faster

We fuse human ingenuity with AI to unleash limitless creative sparks. Are you ready to set yours on fire?


What I Learned About The Value Of Human Work, After Months of Working With AI Coding Agents

I’ll start with a confession:

I was wrong.

Not about AI being powerful. It is.

Also, not about AI changing software work. It already has.

I was wrong about what kind of thing AI is. I assumed, at first, that AI might simply be “more intelligent” than humans in the way a crane is stronger than a person: bigger machine, faster output, same category.

After ~14 months of building with coding agents — shipping prototypes, breaking systems, rebuilding them, and moving from a locally run CLI toy into a real platform — I don’t think that anymore.

What I see now is this: AI is not a better human mind; it’s a different cognitive architecture altogether. If you miss that, you will misread both AI and human work. It’s a tiny lapse in reasoning that sits underneath a lot of the current AI discourse. It’s also why “software is dead” hot takes sound clever on social media and then die the moment you need auth, billing, persistence, observability, or a system that still works on Tuesday.

Thesis 1: Clarity is kindness

The first thing coding agents taught me about human work: Clarity is not bureaucracy. Clarity is kindness.

Kindness to your team. Kindness to your future self.

Kindness to the machine you just asked to produce 5,000 lines of code before lunch.

LLM-based agents are wildly capable. But at their cognitive core, the LLM doing all the “thinking” still operates in bursts of token throughput: tokens in, inference, tokens out. Let me be clear, in case there is any doubt: this is NOT how human brains work. Humans live in something else entirely: a continuous cognitive stream. We keep context alive across time (within the boundaries of our long-term and short-term memory). We carry intent. We revisit assumptions. We ask, nonstop:

  • Is this still the right direction?

  • What problem are we actually solving?

  • What are the non-goals?

  • Which constraint is real, and which one is just noise?

That loop is not overhead. We call it ‘inner monologue’, ‘strategic thinking’, ‘executive function’. Whatever you want to call it: that loop is the work.
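The “bursts of token throughput” point can be made concrete with a minimal sketch (the API here is hypothetical, not any specific vendor’s): each model call is stateless, so whatever continuity exists has to be replayed as context by the surrounding system.

```python
# Minimal sketch of burst-shaped "thinking" (hypothetical API, not a real SDK):
# every call is stateless; the model keeps nothing between bursts.

def call_model(prompt: str) -> str:
    """Stand-in for one LLM inference burst: tokens in, inference, tokens out."""
    return f"(completion for {len(prompt)} chars of context)"

history: list[str] = []  # the "memory" lives outside the model

def ask(user_message: str) -> str:
    history.append(f"User: {user_message}")
    # The entire conversation is re-serialized on every turn; continuity
    # exists only because *we* replay it, not because the model remembers.
    reply = call_model("\n".join(history))
    history.append(f"Assistant: {reply}")
    return reply
```

The continuous loop humans run for free (carrying intent across time) is, for an agent, an engineering artifact you have to build and maintain.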

In long development sessions with coding agents, we’ve seen this pattern clearly reflected: what we are doing is often not “coding” per se. Coding agents have shifted almost all of our developer time to directional labor:

  • defining scope

  • setting goals

  • defining non-goals

  • writing specs

  • aligning requirements

  • sequencing constraints

  • sharpening product intent

Yes, the AI can generate pieces of that. But it doesn’t have your intent. It doesn’t know your taste. It doesn’t know which compromise is acceptable and which one would quietly wreck the product six weeks from now.

This is not just an anecdotal founder rant. Anthropic’s 2025 internal study (132 engineers/researchers, 53 interviews, internal Claude Code usage data) found strong AI use for debugging and code understanding, with big self-reported productivity shifts — but also explicit concern about losing deep technical competence, weakening collaboration, and needing new approaches to learning and mentorship. They describe this as an early signal of broader societal transformation. 

That tracks exactly with what we’ve seen:

The agent can move fast.

It cannot care.

It’s the equivalent of a self-driving chainsaw. Human judgment is the only thing between your code and its teeth.

Thesis 2: Vibe architecture is no architecture

The funniest and most dangerous lie in AI right now is the idea that, because “vibe coding” can produce software, architecture no longer matters.

It matters more.

Coding agents can produce impressive-looking output fast, and it can still be the wrong move.

Our early version was a local CLI MVP. Great. Fast. Useful. Then we moved toward a real platform and the grown-up questions arrived immediately:

  • user identity

  • authentication

  • storage/persistence

  • billing

  • deployment strategy

  • infrastructure

  • observability

  • failure modes

That’s where many people discover: “generate app” is not the same ask as “design a system.”

It’s not that AI can’t help with these kinds of problems. It absolutely can. It can accelerate implementation and explore options quickly. But the truth is that modern software development is a series of deliberate choices. If you don’t know the landscape, if you don’t understand the option space, a coding agent will happily assist you as you “vibe code” yourself into a dead end you never meant to build in the first place.

I’ve done it. Several times.

And that is not an AI failure. It’s a leadership failure. A product failure. An architecture failure.

The benchmarks are quietly saying the same thing. OpenAI’s SWE-Lancer benchmark used 1,400+ real freelance software tasks (including managerial decision tasks), and OpenAI explicitly reports that frontier models were still unable to solve the majority of tasks. METR’s randomized trial with experienced open-source developers on their own repos found that, in that setting, AI tool use made them 19% slower on average—even though the developers expected speedups. METR also stresses not to overgeneralize, but the result is a useful antidote to benchmark fantasy. 

That doesn’t mean AI is bad. It just means reality is large.

So yes, vibe coding is real. It’s useful, and it can be magical. But it is also often a speedrun into hidden complexity.

Vibe architecture is no architecture.

Thesis 3: Creativity does not come from abundance

The third thing coding agents taught me surprised me the most.

AI makes cognition feel abundant:

Need 20 implementation paths? Done.

Need 10 names? Done.

Need 4 refactor strategies? Done.

But creativity does not thrive in abundance. Innovation is born from scarcity. And creativity is innovation + relevance, optimized under utility constraints.

That last part matters: utility constraints.

A coding agent can be inventive. It can absolutely produce novel moves. But novelty is not creativity by itself. Creativity starts when someone makes a judgment:

  • this is the direction

  • these options are out

  • this tradeoff is worth it

  • this is elegant enough

  • this is useful enough

  • this is aligned

In other words: creativity is not just generation.

Creativity is selection under constraints.

And selection is painful. It means cutting away options, saying no. It means carrying the weight of taste, context, and accountability.

Machines are very good at generating options. Humans are still doing most of the meaningful reduction.

This is where the broader evidence is nuanced. The OECD’s 2025 review of experimental evidence summarizes real productivity gains (often 5% to 25%+ in the right tasks), especially when task fit is good — but also emphasizes that benefits depend on user skill, output evaluation, and proper use. They also flag a real risk: over-reliance can reduce independent thinking if people stop critically engaging with outputs. 

AI doesn’t eliminate the need for human judgment. It dramatically raises the cost of not having any.

This is not a software story, but a civilization story

If machines become abundant generators, then human value shifts upstream and downstream:

  • upstream: framing, intent, constraint design, ethics, taste

  • downstream: judgment, integration, accountability, consequences

You can see this in the current public discourse around coding roles: even people building agent tools are saying the center of gravity is moving from typing code to writing specs, defining intent, and talking to users. Boris Cherny, creator of Claude Code, said he expects major role shifts and more emphasis on spec work.  Stanford HAI’s expert predictions similarly point toward collaborative agent systems with humans providing high-level guidance — and note the growing pressure to prove real-world value, not just demos. 

And globally, the labor signal is neither utopian nor apocalyptic. The ILO’s 2025 update says one in four workers is in an occupation with some degree of GenAI exposure, but also emphasizes that most jobs are more likely to be transformed than eliminated, because human input remains necessary.  Meanwhile, the World Economic Forum’s 2025 digest says 39% of workers’ skills are expected to be transformed by 2030, with AI skills rising alongside creative thinking, resilience, leadership, and lifelong learning. 

That combination is the signal: humanity is being re-specified, not replaced. Humanity is going to get itself one giant promotion, from working to leading: leading armies of AI agents doing the work.

The danger is not (only) job loss. It’s skill atrophy, shallow thinking, and handing over too much judgment because the machine sounds fluent.

The opportunity is the opposite: teach people critical thinking, taste, rigor, ethics, architecture, and the discipline to choose. And the result will be a world where more people can build and thrive.

AI is changing what “being useful” means.

AI accelerates cognitive work. It does not make it any less tedious. If you want the upside without the chaos, you still need the “boring” things:

  • architecture

  • product thinking

  • systems design

  • constraints

  • taste

  • deliberate choice

Not sequentially. In parallel. All the time.

That’s the real lesson from 14 months of building with agents: the machine can do more of the work than I expected, and it has made human thinking more critical than ever.

Inconvenient for people who expected a shortcut.

Excellent news if you are in it to build.

—

Jo Wedenigg is the founder of Apes on fire, where he builds human x AI collaboration systems for creative, strategic, and transformation work. He is the creator of Ape Space and focuses on turning AI into a partner for advanced thinking.
More Human or More Useful?

The agent discourse is starting to sound like a gym-bro conversation.

“Bro, your loop is too small.”

“Bro, your context window isn’t stacked enough.”

“Bro, add memory. No —  m o r e  memory.”

“Bro, agent rules don’t matter.”

“Bro, recursive language models.”

And sure—some of that is real engineering. Miessler’s “the loop is too small” is a fair provocation: shallow tool-call loops do cap what an agent can do. Recursive Language Models are also legitimately interesting — an inference-time pattern for handling inputs far beyond a model’s native context window by treating the prompt as an “environment” you can inspect and process recursively.
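The recursive idea is easier to see in a toy sketch (a generic recursive-processing pattern under assumed parameters, not any paper’s actual implementation): when the input exceeds the window, inspect it in window-sized pieces, condense each, and recurse on the condensed result.

```python
# Toy sketch of recursive processing over an oversized prompt.
# WINDOW and summarize() are stand-ins, not real model parameters.

WINDOW = 100  # pretend context limit, in characters

def summarize(text: str) -> str:
    """Stand-in for one bounded model call that compresses its input."""
    return text[: WINDOW // 2]  # fake "compression": keep the first half-window

def process_long_input(text: str) -> str:
    if len(text) <= WINDOW:
        return summarize(text)  # fits in one call
    # Treat the prompt as an environment: walk it in window-sized pieces...
    chunks = [text[i : i + WINDOW] for i in range(0, len(text), WINDOW)]
    condensed = "".join(summarize(c) for c in chunks)
    # ...then recurse on the condensed representation until it fits.
    return process_long_input(condensed)
```

Each recursion level shrinks the input, so arbitrarily long prompts terminate in a bounded number of model calls.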

But here’s the problem: a growing chunk of the discourse is no longer about solving problems. It’s about reenacting our folk theories of “thinking” in public—and calling it progress.

If you squint, you can already see the likely destination: not AGI. AHI – Artificial Humanoid Intelligence: the mediocre mess multiplied. A swarm of synthetic coworkers reproducing our worst habits at scale—overconfident, under-specified, distractible, endlessly “reflecting” instead of shipping. Not because the models are evil. Because we keep using human-like cognition as the spec, rather than outcomes.

And to be clear: “more human” is not the same as “more useful.” A forklift doesn’t get better by developing feelings about pallets.

The obsession with “agent-ness” is becoming a hobby

Memory. Context. Loop size. Rules. Reflection. Recursion.

These are not products. They’re ingredients. And we’ve fallen in love with the ingredients because they’re measurable, discussable, and tweetable.

They also create an infinite runway for bike-shedding. If the agent fails, the diagnosis is always the same: “needs more context,” “needs better memory,” “needs a bigger loop.”

Convenient — because it turns every failure into an invitation to build a bigger “mind,” instead of asking the humiliating question:

What problem are we actually solving?

A lot of agent builders are inventing solutions independent of problems: designing elaborate cognitive scaffolds for tasks that were never constrained, never modeled, never decomposed, and never given domain primitives.

It’s like trying to build a universal robot hand to butter toast.

Our working hypothesis: Utilligence beats AGI

At Apes on fire, we’re not allergic to big ideas. We’re just allergic to confusing vibes with value.

Our bet is Utilitarian Intelligence — Utilligence — the unsexy kind of “smart” that actually works: systems that reliably transform inputs into outcomes inside a constrained problem space. (Yes, we’re aware that naming things is half the job.)

If you want “real agents,” start where software has always started:

Classic systems design. State design. Architecture. Domain-centric applications.

Not “Claude Coworker for Everything,” but “the Excel for this,” “the Photoshop for that,” “the Figma for this workflow.”

The future isn’t one mega-agent that roleplays your executive assistant. It’s a fleet of problem-shaped tools that feel inevitable once you use them — because their primitives match the domain they are operating in.

Stop asking the model to be an operating system

LLMs are incredible at what they’re good at: stochastic synthesis, pattern completion, recombination, compression, ideation, drafting, translation across representations.

They are not inherently good at being your cognitive scaffolding. In the modern technology stack, models are much closer to a processor than to an operating system.

So instead of building artificial people, we’re building an exoskeleton for human thinking: a structured environment where the human stays the decider and the model stays the probabilistic engine. The scaffolding lives in the system — state machines, constraints, domain objects, evaluation gates, deterministic renderers, auditability.

In other words: let the model do the fuzzy parts. Let the product do the responsible parts.
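That division of labor can be sketched in a few lines (all names here are hypothetical illustrations, not our actual system): the model proposes, a deterministic gate enforces hard constraints, and the human makes the final call.

```python
# Toy sketch of "model does the fuzzy parts, product does the responsible parts".
# model_propose, evaluation_gate, and human_decides are illustrative stand-ins.

def model_propose(brief: str) -> list[str]:
    """Stand-in for the probabilistic engine: generates candidate options."""
    return [f"{brief} v{i}" for i in range(1, 6)]

def evaluation_gate(candidates: list[str], max_len: int) -> list[str]:
    """Deterministic scaffolding: hard constraints the product enforces,
    independent of how fluent the candidates sound."""
    return [c for c in candidates if len(c) <= max_len]

def human_decides(survivors: list[str]) -> str:
    """The human stays the decider; faked here as picking the first survivor."""
    return survivors[0]

proposals = model_propose("tagline")
survivors = evaluation_gate(proposals, max_len=12)
choice = human_decides(survivors)
```

The design point: the gate and the decision live in ordinary, auditable code, so the probabilistic engine never gets to be the last word.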

If we must learn from humans, let’s learn properly

Here’s the irony: the same crowd racing to build “human-like” agent cognition often has the loosest understanding of human cognition.

Before we try to manufacture artificial selves, maybe we should reread the observers of the human state. Kahneman’s Thinking, Fast and Slow is still a brutal reminder that “how we think” is not a very flattering blueprint. We are bias engines with a narrative generator strapped on top. Is that what we want an artificial “problem solver” to mimic?

Maybe not. Maybe the move is not: “let’s copy humans harder.” Maybe the move is: define the problem first, then build the machine that solves it. 

Because “more of us” isn’t automatically the solution. Sometimes it’s just more of the problem. So instead of Artificial Humanoid Intelligence, let’s work on Utilligence: intelligence with a job description.

Forge

PUBLIC BETA COMING SOON

Forge is where you take your ideas from spark to impact, providing you all the tools to drive interactive, AI-powered brainstorming and breakthrough innovation sessions.

Rapid innovation and brainstorming

Lightning-fast ideation cycles that transform scattered thoughts into structured innovation frameworks.

Graph-based idea management

Visualize connections between concepts with intuitive knowledge graphs that reveal hidden insights.

Contexts to add depth

Rich contextual layers that bring nuance and specificity to every creative exploration.

The tech inside the spark

We are building the platforms to work with whatever intelligence comes next

Thinking bigger at scale

We are building the platforms to work with whatever intelligence comes next

Where Innovation Takes Flight

Discover our big-picture outlook and see how Apes on fire is reshaping creative possibilities.