Notes on "Programming as theory building"

· 7 min read
Bruno Felix
Digital plumber, organizational archaeologist and occasional pixel pusher

Programming as Theory Building1 is an almost 40-year-old paper that remains relevant to this day. In it, the author (Peter Naur, Turing Award winner and the "N" in BNF2) dives into the fundamental question of what programming is, and builds from there to address what expectations one can have, if any, about the modification and adaptation of software systems.

This paper offers a "Theory building view of programming" in which the core of a programmer's work is to create a "theory of how certain affairs of the world will be handled by, or supported by, a computer program". Such a theory informs the basic principles, abstractions and assumptions that are baked into a system's code, and thus the extensibility of a program is enabled or constrained by the degree to which teams are able to maintain a coherent theory for the system.

Naur adopts a definition of theory that is predicated on a distinction between intelligent and intellectual work, a theory being required for the latter. Intelligent work, on the one hand, is defined by the ability to do a certain task well (according to some criteria), correcting errors and learning from examples. Intellectual work, on the other hand, requires not only the knowledge to perform a task intelligently, but also the ability to explain things, answer questions, and critically appraise a particular solution or practice.

A programmer's work is therefore to formulate a theory of how certain processes and outcomes can be supported by code executing on a computer. Equipped with such a theory, the programmer is able to:

  1. Explain how the relevant parts of the "world" that are in scope for the system map onto the program and its structure, and how both relate to the desired outcomes;

  2. Explain why the program is structured in a specific way and what the underlying design principles and metaphors are;

  3. Explain how the program can best be extended by leveraging the existing abstractions and capabilities.

It's models all the way

An interesting nuance is that if software development requires a theory of how a program will achieve certain outcomes, that in turn requires a theory of the underlying problem: otherwise, how can one make an informed judgment about which facts or processes in the world are relevant to a particular technical solution?

This leads to an interesting observation: the quality of the mental model of the solution is directly correlated with the quality of the mental model one has of the underlying problem. And over time, the ability to gracefully extend a software system depends on new joiners being able to form a coherent mental model of the problem and the system, as well as to understand the mental models of the programmers who came before.

The life and death of software systems

"The program text and its documentation has proved insufficient as a carrier of some of the most important design ideas."

Having seen my fair share of different systems throughout my career, and even "inherited" some in less-than-stellar shape, I have come to appreciate how systems may end up having an expiry date even while they are still producing value. How many times have you looked at a code base with important but ultimately not very enlightening documentation (e.g. initial project setup and little else) and packages/modules named after generic technical constructs such as controllers or services? Without access to the people who originally created the system it is exceedingly hard to understand the rationale behind some decisions. Practices like ADRs3 may help; however, there are limits to the level of understanding that can be gained from documentation alone.
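The naming problem can be made concrete. A package tree organized around technical constructs tells a new joiner nothing about the theory behind the system, while one organized around the domain at least hints at the concepts that theory is about. A small sketch (all file and concept names here are invented for illustration):

```python
# Two hypothetical layouts for the same order-management system.

# Technical-construct layout: the names reveal the framework, not the domain.
technical_layout = [
    "app/controllers/order_controller.py",
    "app/services/order_service.py",
    "app/repositories/order_repository.py",
]

# Domain-oriented layout: the names echo the concepts the system's theory is about.
domain_layout = [
    "app/ordering/checkout.py",
    "app/ordering/payment_capture.py",
    "app/fulfilment/shipment_tracking.py",
]

def visible_vocabulary(paths):
    """Collect the terms a reader can learn from the file paths alone."""
    return {part for p in paths for part in p.removesuffix(".py").split("/")[1:]}

# The technical layout teaches a reader words like "controllers" and "services";
# the domain layout teaches words like "ordering" and "fulfilment".
print(sorted(visible_vocabulary(technical_layout)))
print(sorted(visible_vocabulary(domain_layout)))
```

Neither layout carries the full theory, of course, but the second gives a new joiner a vocabulary to start asking the right questions with.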

"On the basis of the Theory Building View the decay of a program text as a result of modifications made by programmers without a proper grasp of the underlying theory becomes understandable."

This has implications for some of the assumptions that go into the practice of software development. First and foremost, socializing the theories and metaphors that underpin a system should be seen as a worthwhile investment in its long-term sustainability. The corollary of this idea is that a software system "dies" once the theory behind it is lost to its current operators. The system may be kept running, but further modifications will most likely result in a patchwork of one-off fixes. This in turn provides ample grounds to challenge the perception that software systems are extremely adaptable by definition. If the team owning a system no longer has a coherent theory for that system (and maybe not even for the underlying problem), any changes will probably be crudely bolted on top of the existing implementation, and that system is well on its way to becoming a toxic asset, no matter how bright people are or how fast they type. In such circumstances it is worth considering whether it is better to start from scratch (after all, buildings get demolished all the time; why not software systems?). Such a decision should not be taken lightly, but in many cases it can be done iteratively4, making it possible to avoid the most common pitfalls of "second system syndrome"5.

Relevance in today's world

None of this is new; after all, the paper is almost 40 years old. Practices like XP have been around for quite a while and "Agile" became mainstream (and probably got body-snatched in the process), yet I think the paper is extremely relevant in the present context. The idea that the major cost driver in software systems is writing the code is sadly still very much alive. It is untrue, and betrays an at best superficial understanding of what goes into building software systems, but it nonetheless has quite a bit of traction among managers and all sorts of expert beginners6. As we speed-run through a generative AI bubble7, the premise behind a lot of "Gen AI enabled" tools or "AI developers" is that writing code will become dramatically cheaper, and thus programmers can either be far more effective or be replaced altogether.

As this paper argues, that is a fundamentally incorrect perspective. The key activity in developing software is coming up with a theory of the problem, figuring out a theory for the solution, socializing it, and keeping it up to date as new stressors are discovered and the team and surrounding organization change. For the time being, that remains squarely a human activity.


Footnotes

  1. Programming as Theory Building

  2. BNF - Backus Naur Form

  3. For some notes about ADRs (Architecture Decision Records) and other "architecture" practices, my own article on the topic may be interesting.

  4. I strongly encourage you to think about how to gradually introduce new changes using techniques like the Strangler Fig Application

  5. Second system syndrome. I also recommend this article on software rewrites.

  6. How Developers Stop Learning: Rise of the Expert Beginner

  7. Some would probably call generative AI a revolution, and while it's not a complete cesspool of grift like crypto, the industry is vastly exaggerating what it can reasonably achieve. And if one year ago almost no one questioned the value of the investments in this area, nowadays the story is quite different. Some examples: FT, Fortune, Bloomberg