Library · paper

Beyond Human-Readable: Rethinking Software Engineering Conventions for the Agentic Development Era

Dmytro Ustynov
2026

Source: https://arxiv.org/abs/2604.07502

Full text: arXiv preprint

Ustynov takes a plain but important observation and works it through with unusual patience: for sixty years, the conventions of software engineering (naming, design patterns, project layout, SOLID, logging formats, commit messages) have been optimized for one consumer, the human developer, whose working memory is limited and who reads sequentially.

When an LLM-based agent becomes the primary reader and writer of code, the optimization target changes.

Different constraints apply: token budgets, tool-call costs, context-window decay.

The paper's most arresting finding comes from a controlled experiment on log format compression: aggressive compression increased total session cost by 67% despite reducing input tokens by 17%, because the interpretive burden moved to the model's reasoning phase. It is a reminder that naive "smaller = cheaper" intuitions fail once the consumer is an agent.

From that empirical base he proposes a principle (semantic density optimization), rehabilitates several classical anti-patterns, and argues for decoupling semantic intent from human-readable representation.

For product direction the useful move is not his specific prescriptions but the broader reframing: when the agent becomes the first reader, much of what we took for engineering virtue turns out to have been an accommodation to a cognitive constraint that no longer binds.

Central argument

Ustynov argues that six decades of software engineering conventions (file splitting, abstraction depth, SOLID principles, logging formats) were optimized for human cognition and must be reconsidered now that LLM-based agents are the primary consumers of codebases. His central principle is 'semantic density optimization': eliminate zero-information tokens (boilerplate, ceremonial syntax, deep abstraction hierarchies) while preserving high-information tokens (descriptive names, type annotations, rich commit messages). A controlled experiment on log formats produced a counterintuitive result: aggressive compression increased total session cost by 67% despite reducing input tokens by 17%, because it offloaded interpretive work to the model's reasoning phase rather than eliminating it.
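The arithmetic behind that result is easy to lose, so a minimal sketch may help. It assumes a simple cost model in which input tokens and output (reasoning) tokens are billed at different per-token rates. The function names, the prices, and every token count below are hypothetical, chosen only so that the percentages match the 17% and 67% figures quoted above. They are not the paper's measurements.

```python
# Illustrative only: token counts and per-token prices are hypothetical,
# picked so the percentages match the figures quoted in this summary.
# They are NOT data from the paper.

def session_cost(input_tokens: int, output_tokens: int,
                 price_in: int = 1, price_out: int = 5) -> int:
    """Total cost in arbitrary units, pricing input and output tokens separately.

    Assumption: output (reasoning) tokens cost more per token than input tokens.
    """
    return input_tokens * price_in + output_tokens * price_out


def pct_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`, rounded to one decimal."""
    return round((after - before) / before * 100, 1)


# Hypothetical session with verbose, self-explanatory logs.
baseline = {"input_tokens": 100_000, "output_tokens": 10_000}

# Hypothetical session with aggressively compressed logs: fewer input
# tokens, but the model spends more output tokens decoding them.
compressed = {"input_tokens": 83_000, "output_tokens": 33_500}

baseline_cost = session_cost(**baseline)
compressed_cost = session_cost(**compressed)

input_change = pct_change(baseline["input_tokens"], compressed["input_tokens"])
cost_change = pct_change(baseline_cost, compressed_cost)

print(f"input tokens: {input_change:+.1f}%")
print(f"total cost:   {cost_change:+.1f}%")
```

Under these assumed numbers the input side shrinks by 17% while the total rises by 67%, because the extra output tokens are both more numerous and more expensive than the input tokens saved. The general point survives any particular price ratio: a saving on the input side can be outweighed by growth on the reasoning side.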

Critique

The paper's experimental foundation is narrower than its architectural conclusions warrant: a single controlled experiment on log format token economy is used to underpin sweeping recommendations about SOLID principles, file structure, and anti-pattern rehabilitation. The author acknowledges the 'Lost in the Middle' phenomenon, in which LLMs attend less reliably to content positioned in the middle of a long context, as a direct challenge to large-file consolidation, yet recommends consolidation anyway without resolving the tension. The risk is that practitioners adopt the framework's confident taxonomy while the empirical basis for several of its claims remains, by the author's own admission, an open question.

Why it matters for product

For a CPO, the most actionable implication is organizational: if AI agents are penalized by deep package hierarchies and high ceremony-to-logic ratios, then team and architecture decisions that were justified by human cognitive limits (microservice granularity, strict layering, extensive DI frameworks) may now carry a hidden productivity tax in agentic workflows. The proposed CODEMAP.md artifact also raises a product governance question: who owns the semantic contract between a codebase and its AI tooling, and how does that ownership surface in engineering team rituals and delivery metrics? As the share of AI-generated code approaches 100% in some teams, product leaders need to reframe 'engineering quality' standards, currently built around human readability, before those standards actively slow down AI-assisted delivery.