AI Made Code Cheap. The Bottleneck Is Now Understanding Systems.

Stephen Collins · Mar 8, 2026
6 min read

Key Points

What does it mean that AI made code cheap?

AI tools like Cursor, Copilot, and Claude Code can now produce large amounts of working software from small prompts. Code generation is no longer the rate-limiting step in software development.

What is the new engineering bottleneck in the AI era?

System comprehension — understanding which parts of a codebase matter, where risk concentrates, and where to focus engineering effort. This knowledge used to live in tribal memory; now it needs tooling.

What is hotspot analysis?

A method that combines commit frequency and complexity to identify high-risk code: files that are both hard to understand and actively modified are most likely to accumulate bugs and technical debt.

What is codebase observability?

The practice of treating a codebase's evolution as something worth measuring over time — tracking where complexity accumulates, which modules are becoming unstable, and how AI-generated code changes system structure.

How does activity-weighted risk work?

Risk is approximated as change frequency multiplied by complexity. Files that change constantly and are hard to understand are disproportionately likely to cause problems.

This post was originally published on hotspots.dev.


Software used to be expensive to produce. Developers were constrained by how fast they could write code, which meant teams had to be deliberate about what they built. Every feature was an investment.

That constraint is eroding quickly. AI tools can now generate large amounts of working software from relatively small prompts. Cursor scaffolds full modules from a description. Claude Code produces working CLIs in minutes. Copilot fills in entire implementations as you type. One prompt can produce hundreds of lines of code — complete API handlers, even full prototypes.

Code production is no longer the bottleneck.

The constraint has moved somewhere else.

The Bottleneck Has Moved

When the cost of producing something drops dramatically, volume increases. Software is no exception.

As code becomes cheaper to produce, the hard problem shifts from writing to understanding. Developers increasingly spend their time asking different questions:

  • Which parts of this codebase actually matter?
  • Where are bugs most likely to appear?
  • Which modules are fragile?
  • Where should we direct engineering effort?

These are questions about the system itself, not individual files or functions. They also get harder to answer as systems grow, and AI accelerates that growth.

The engineering bottleneck used to be code creation. Now it’s system comprehension.

Three Structural Effects of Cheap Code

When something becomes dramatically cheaper to produce, predictable effects follow.

Codebases grow faster. AI removes friction from writing new components. Developers spin up modules they previously would have avoided, scaffold services that once took days, and generate helper libraries on the fly. The number of files, abstractions, and moving parts increases. Systems get larger.

Systems change faster. AI also accelerates commit velocity. Refactors become cheaper. Experiments are easier to try. Iteration cycles compress. This is broadly positive — faster feedback loops are good — but it also means the codebase changes at a rate that can outpace a team’s ability to reason about it.

Architectural entropy increases. AI tools are optimized for local correctness — does this function do what I asked? — not for global coherence. Over time, the first two effects compound. Systems still work, but duplicated patterns, inconsistent abstractions, sprawling modules, and accidental complexity pile up.

What Existing Tooling Misses

Most developer tooling falls into two categories.

There are tools that help you write code: IDEs, autocomplete, and AI assistants like Copilot or Cursor. These help developers produce code faster. They’ve gotten remarkably good at this.

There are tools that help you check code: linters, static analyzers, type systems, security scanners. These validate correctness — does the code follow rules, does it compile, does it have obvious vulnerabilities.

What’s mostly missing is a third category: tools that answer the question “which parts of the system matter most?”

Today, developers answer this question through intuition and tribal knowledge. Someone on the team “just knows” that the auth module is fragile, or that the payments service has accumulated a lot of technical debt. That knowledge lives in people’s heads, doesn’t transfer well, and becomes harder to maintain as systems grow and teams change.

As AI accelerates codebase growth, relying on intuition becomes increasingly untenable.

The Meta Tool Layer

┌─────────────────────────────────────────────────────┐
│  Layer 3 - Codebase Intelligence                    │
│  Hotspot analysis · architecture evolution ·        │
│  system risk signals                                │
├─────────────────────────────────────────────────────┤
│  Layer 2 - Code Validation                          │
│  Linters · type systems · static analysis ·         │
│  security scanners                                  │
├─────────────────────────────────────────────────────┤
│  Layer 1 - Code Creation                            │
│  IDEs · Copilot · Cursor · AI coding agents         │
└─────────────────────────────────────────────────────┘

One way to think about this missing category is meta tools.

Where conventional tools analyze individual files — this function is too complex, this import is unused — meta tools analyze the codebase itself. They examine system structure, change patterns, and architectural behavior over time. They answer questions like:

  • Where does change concentrate?
  • Where does complexity accumulate?
  • Which modules are stable, which are volatile?

Meta tools sit above the code.
They don’t tell you how to fix a specific bug.
They tell you where to look.

Hotspot Analysis as a Concrete Example

One useful approach combines two signals already present in almost every repository: commit history and complexity.

Some files change constantly. Some files are extremely complex. When those two traits overlap, you have a problem — a file that is both hard to understand and actively modified is a high-probability site for bugs, regressions, and compounding technical debt.

The core heuristic is simple:

risk ≈ change frequency × complexity

In practice, this surface area is small. In most systems, perhaps 5–10% of the codebase accounts for the majority of the structural risk. Identifying that slice lets engineering teams make better decisions about where to invest time — what to refactor first, what to review most carefully, where to add test coverage.
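The heuristic is simple enough to sketch in a few lines of Python. This is an illustrative sketch, not the Hotspots implementation: it reads change frequency from standard `git log` output and uses indented-line count as a crude stand-in for a real complexity metric (a production tool would use cyclomatic complexity from a static analyzer). The function names here are hypothetical.

```python
import subprocess
from collections import Counter

def change_frequency(repo="."):
    """Count how many commits touched each file, from `git log` output."""
    log = subprocess.run(
        ["git", "-C", repo, "log", "--name-only", "--pretty=format:"],
        capture_output=True, text=True, check=True,
    ).stdout
    return Counter(line for line in log.splitlines() if line.strip())

def indentation_complexity(path):
    """Crude complexity proxy: number of indented lines.
    A real tool would substitute cyclomatic complexity here."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        return sum(1 for line in f if line.startswith((" ", "\t")))

def risk_scores(change_counts, complexity_of):
    """risk ≈ change frequency × complexity, ranked highest first."""
    return sorted(
        ((path, count * complexity_of(path))
         for path, count in change_counts.items()),
        key=lambda item: item[1],
        reverse=True,
    )
```

Ranking a repository is then `risk_scores(change_frequency("."), indentation_complexity)`, minus housekeeping a real tool needs, such as skipping files that have since been deleted and normalizing the two signals so neither dominates.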

Hotspot analysis is one concrete example of a meta tool.

Hotspots applies this idea to any git repository, ranking functions by activity-weighted risk and surfacing the small portion of the codebase that dominates engineering effort. The goal isn’t to tell you your code is bad. It’s to give you a map.

Why This Matters More in the AI Era

AI tools make hotspot signals stronger, not weaker.

When developers iterate faster, commit velocity increases — and high-velocity files accumulate churn signals more quickly. When AI generates code that gets repeatedly revised, rewritten, or extended, those integration points show up clearly in the analysis. When architectural entropy increases, complexity metrics climb.

The problem hotspot analysis solves — identifying where risk concentrates — becomes more valuable as the forces that create risk accelerate.

Codebase Observability

The broader idea behind this goes beyond hotspot analysis.

Modern production systems have strong observability. We measure logs, metrics, traces, error rates, latency. We have dashboards for system health. When something goes wrong in production, we have tools to understand why.

We don’t have equivalent tooling for the codebase itself.

We don’t routinely measure how architecture evolves, where complexity accumulates over time, which modules are trending toward instability, or how AI-generated code changes the system’s structural properties. That information exists in git history and static analysis — it just hasn’t been assembled into something useful.

The next step is codebase observability: treating the evolution of software systems as something worth measuring, monitoring, and understanding over time. Not just “is the code correct?” but “how is the system changing, and where is that change creating risk?”
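A minimal version of this is already possible with nothing but git. The sketch below, with hypothetical function names, snapshots one structural metric (total file count, a placeholder for richer metrics like total complexity or module coupling) at each commit, then fits a least-squares slope to see whether the metric is trending up:

```python
import subprocess

def snapshot_metric(repo, sha):
    """File count at a given commit — a stand-in for richer structural
    metrics (total complexity, coupling, module size)."""
    tree = subprocess.run(
        ["git", "-C", repo, "ls-tree", "-r", "--name-only", sha],
        capture_output=True, text=True, check=True,
    ).stdout
    return len(tree.splitlines())

def trend(series):
    """Least-squares slope of a metric over commit index.
    Positive slope: the metric is growing. Needs at least two points."""
    n = len(series)
    mean_x, mean_y = (n - 1) / 2, sum(series) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var
```

Feeding `trend` a series of `snapshot_metric` values over, say, the last hundred commits turns “the codebase feels like it’s sprawling” into a number you can watch on a dashboard.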

What Comes Next

AI is making code abundant.

System understanding is becoming the scarce skill.

The developers who will be most effective in an AI-accelerated world won’t just be the ones who can generate code fastest. They’ll be the ones who can reason clearly about systems — who understand which parts matter, where risk lives, and where to focus attention. Tools that help with that problem will become increasingly important.

Tooling for this problem is still early.
But it’s the right problem to be solving.

If you’re curious where risk concentrates in your own codebase, hotspots.dev is a good place to start.