Exploration and Exploitation in Organizational Learning
James G. March's classic 1991 paper on the tension between exploring (trying new things, experimenting, searching for alternatives) and exploiting (optimising what already works, refining, executing).
Organisations need to do both but tend to fall out of balance.
Push agentic AI into the centre of an organisation and the organisation is forced to explore on every front at once: each team can try more, faster, which strains the mechanisms of exploitation and coherence.
March gives the language to talk about that tension without collapsing it into a binary.
Central argument
March argues that organisations must balance two fundamentally different modes of learning: exploration — searching, experimenting, taking risks with uncertain returns — and exploitation — refining, executing, and optimising existing competencies for reliable near-term returns. The core tension is adaptive: exploitation tends to crowd out exploration because its returns are faster, more certain, and more visible, creating a self-reinforcing trap where organisations become increasingly efficient at things that may become obsolete. March shows this is not a failure of management but a structural property of how organisations learn and allocate attention, making balance a persistent systemic challenge rather than a solvable problem.
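The crowding-out dynamic described above can be made concrete with a toy simulation. This is a minimal sketch under stated assumptions, not March's own model: the function name, the replicator-style reallocation rule, the return values (1.0 for exploitation, 0.2 for the visible part of exploration), and the optional `floor` are all hypothetical choices made for illustration. The only idea taken from the text is that effort follows whichever mode shows faster, more visible returns.

```python
def simulate_explore_share(periods=10, start_share=0.5,
                           exploit_visible=1.0, explore_visible=0.2,
                           floor=0.0):
    """Return the share of effort given to exploration in each period.

    All parameters are hypothetical illustration values, not estimates
    taken from March's paper.
    """
    share = start_share
    history = []
    for _ in range(periods):
        history.append(share)
        # Effort is reallocated in proportion to the returns each mode
        # visibly produced this period (a replicator-style update).
        explore_credit = share * explore_visible
        exploit_credit = (1.0 - share) * exploit_visible
        share = explore_credit / (explore_credit + exploit_credit)
        # Optional protected budget: exploration never drops below `floor`.
        share = max(floor, share)
    return history


if __name__ == "__main__":
    unprotected = simulate_explore_share()
    protected = simulate_explore_share(floor=0.1)
    for t, (u, p) in enumerate(zip(unprotected, protected)):
        print(f"period {t}: unprotected={u:.4f} protected={p:.4f}")
```

With these assumed values, the unprotected exploration share shrinks every period and falls below 1% within a handful of periods, even though nobody in the model decides to stop exploring. Setting `floor` holds the share at that level instead, which illustrates why the note treats balance as a structural matter rather than a managerial lapse.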
Critique
March's model treats exploration and exploitation as operating within a relatively stable organisational boundary, but says little about how the unit of analysis shifts when capabilities are distributed across ecosystems, platforms, or — now — AI agents acting semi-autonomously. The paper also underspecifies the conditions under which an organisation can deliberately shift the balance: it diagnoses the trap compellingly but offers limited prescriptive traction for leaders trying to engineer a rebalance under competitive pressure. A deeper treatment of power, incentives, and political economy inside organisations might explain why the imbalance persists even when leaders correctly diagnose it.
Why it matters for product
For a CPO, the exploration–exploitation lens maps directly onto the structural tension between discovery and delivery: product teams under OKR pressure systematically over-exploit because quarterly metrics reward shipping over learning, making exploration feel like waste rather than investment. The framing becomes especially sharp in decisions about team topology: dedicated platform or growth teams tend toward exploitation while embedded research or 0-to-1 teams hold exploration capacity, and cutting the latter first in a downturn is exactly the trap March describes. As AI tooling accelerates the pace at which each team can run experiments, the bottleneck shifts from generating options to integrating and consolidating what is learned, an organisational design problem that March's vocabulary helps name precisely.