This paper is 50+ years old and yet programmers are still making the same mistake of defining modules based on a flowchart, instead of modules that hide design decisions. I think that shows the youth (and perhaps lack of foundational training) of software engineering as a field.
But also the progress of the last 50 years: the KWIC index went from “a small system [that] could be produced by a good programmer within a week or two” to a tiny system that can be written in a couple lines using the standard library of Java (as I once saw a demo of).
Also: I recently wrote a blog post related to this paper and all of the initial test readers told me they had no idea what a KWIC index was and it needed more explanation. Instant full-text search really changes things.
Coincidentally, I reread the paper last weekend, and that side remark about how the KWIC index would take a week or two to implement got my attention. So I took the short specification given in the paper and wrote it down in two dozen lines within a few minutes. I wrote it up as a blog post just yesterday: https://holzer.online/articles/2025/03-05-kwic-quickly/
With a bit of cocktail napkin math, that's more than a factor of 100 faster than Parnas' estimate, just for having a higher-level language with a reasonable standard library at your hands.
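For readers who have not met a KWIC index: the paper's specification takes a set of lines, forms every circular shift of each line (repeatedly moving the first word to the end), and lists all the shifts in alphabetical order. Below is a minimal Python sketch of that specification. It is not the code from the linked blog post; the function names and the case-insensitive ordering are assumptions.

```python
def circular_shifts(line: str) -> list[str]:
    """Return every rotation of the words in `line`."""
    words = line.split()
    return [" ".join(words[i:] + words[:i]) for i in range(len(words))]


def kwic(lines: list[str]) -> list[str]:
    """Return all circular shifts of all lines, sorted alphabetically."""
    shifts = [shift for line in lines for shift in circular_shifts(line)]
    # Assumption: ordering ignores case; the spec only asks for alphabetical order.
    return sorted(shifts, key=str.casefold)


if __name__ == "__main__":
    for entry in kwic(["Clouds are white", "Pittsburgh is beautiful"]):
        print(entry)
```

Almost all of the work is done by `split`, slicing and `sorted`, which is roughly the "reasonable standard library" point.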
Fred Brooks wrote in his 1986 "No Silver Bullet" essay that "There is no single development, in either technology or management technique, which by itself promises even one order of magnitude improvement in productivity, in reliability, in simplicity." But the compound effects of many years of many, many gradual improvements in developer tooling are really not bad.
The paper came out in 1972, which is the same year that C was invented. I don't think the paper mentions which programming language was used, but since it has function calls with parameters, it's a higher-level language than assembly.
The full KWIC index program specification can be found in the original 1971 paper (reference [8] in the 1972 paper): https://kilthub.cmu.edu/articles/journal_contribution/On_the...
The code therein uses the specification language described in https://dl.acm.org/doi/pdf/10.1145/355602.361309 (reference [3] in the 1972 paper). From there:
"The notation is mainly ALGOL-like and requires little explanation. To distinguish references to the value of a function before calling the specified function from references to its value after the call, we enclose the old or previous value in single quotes (e.g. 'VAL'). If the value does not change, the quotes are optional. Brackets ("[" and "]") are used to indicate the scope of quantifiers. "=" is the relation "equals" and not the assignment operator as in FORTRAN."
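As a rough illustration of the quoted convention, a specification line such as DEPTH = 'DEPTH' + 1 would say that the value of DEPTH after the call equals its value before the call plus one. The Python sketch below checks that kind of postcondition by capturing the old value first. The Stack class and its method names are hypothetical, chosen only to illustrate the old-versus-new distinction.

```python
class Stack:
    """Hypothetical module used only to illustrate old-vs-new values."""

    def __init__(self) -> None:
        self._items: list[int] = []

    def depth(self) -> int:
        return len(self._items)

    def val(self) -> int:
        return self._items[-1]

    def push(self, a: int) -> None:
        old_depth = self.depth()  # plays the role of 'DEPTH' in the notation
        self._items.append(a)
        # "==" here is the relation "equals", checked as a postcondition.
        assert self.depth() == old_depth + 1  # DEPTH = 'DEPTH' + 1
        assert self.val() == a                # VAL = a
```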
Yeah! I liked your blog post :)
We can all be 100x programmers, if we look at a broader timespan.
I see this mistake in project planning as well: I'm asked to scope how long it will take to complete module #1, then module #2, then module #3, etc. In reality they all end up going hand-in-hand, and they depend on some common work that's "invisible" to the planners.
You sort of have to go along with it if you want to estimate. The perfect plan and estimate only happen after the completion date.
There are so many unknown deps and unknown opportunities to optimise at the start of a project.
You find those out as you get in the weeds and as the ICs dream at 1am about their code and bring that idea in the next day.
... then came semantic search, and AI search, which just spits out the answer. I had not heard of KWIC either.
> defining modules based on a flowchart, instead of modules that hide design decisions
ok - "modules that hide design decisions" implies OOP. In thirty years, OOP got applied to build giant systems, some that went too far. There has been backlash against OOP for real reasons, and perhaps for ignorant ones too.
Flowchart coding is straight from "structured programming", an advance from the single-line-of-execution ASM style that sometimes created unintuitive and needlessly complicated execution flow. IMHO there is nothing wrong with structured programming, or with OOP when used appropriately.
No, "modules that hide design decisions" implies modular programming. I think schools should teach modular programming before object-oriented programming. Then this confusion wouldn't arise so often. For a small modular programming language (which also supports OOP), have a look at Oberon:
https://www.miasap.se/obnc/oberon-report.html
> ok - "modules that hide design decisions" implies OOP
It doesn't. Implementation/design decision hiding also exists in procedural, functional and in other paradigms.
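A small sketch of that point in Python, using only functions and a closure: callers see three operations, while the storage representation stays hidden and could be swapped without touching them. The names (`make_line_store`, `add_line`, `word`, `word_count`) are hypothetical, loosely inspired by the line storage module in the paper's second decomposition.

```python
from typing import Callable


def make_line_store() -> tuple[
    Callable[[str], int], Callable[[int, int], str], Callable[[int], int]
]:
    """Create a line store and return its three operations."""
    # Hidden design decision: lines are kept as lists of words.
    # Switching to a packed string plus offsets would not affect callers.
    lines: list[list[str]] = []

    def add_line(text: str) -> int:
        """Store a line and return its index."""
        lines.append(text.split())
        return len(lines) - 1

    def word(line_no: int, word_no: int) -> str:
        """Return word `word_no` of line `line_no`."""
        return lines[line_no][word_no]

    def word_count(line_no: int) -> int:
        """Return the number of words in line `line_no`."""
        return len(lines[line_no])

    return add_line, word, word_count


add_line, word, word_count = make_line_store()
n = add_line("criteria for decomposing systems")
print(word(n, 0), word_count(n))
```

No class appears anywhere, yet nothing outside `make_line_store` can depend on how the lines are stored.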
I didn't say "it demands OOP"; I said "implies OOP". What you say is true also.
> "modules that hide design decisions" implies OOP.
No; it is the other way around.
OOD/OOP implies information hiding/encapsulation.
As a hammer to be used on every single problem, OOP is unbearable. As a tool used when appropriate, OOP is delightful and I get very mad when languages think that they are above implementing it.
It still surprises me there hasn't been much discussion on this article here. The only three submissions that got any traction previously:
https://news.ycombinator.com/item?id=37477446 - Sept 2023 (15 comments)
https://news.ycombinator.com/item?id=30138468 - Jan 2022 (27 comments)
https://news.ycombinator.com/item?id=8849468 - Jan 2015 (5 comments)
Maybe I'm cynical, but I think most people probably comment without reading things all the way through, and in this case you can't just read the first paragraph and get the gist of the article. It takes a bit of time to digest the examples.
Software Fundamentals: Collected Papers by David L. Parnas is an excellent compendium of Parnas' papers. You will find more wisdom here than most books on "Software Engineering" being published today.
For a more recent text which touches on this see John Ousterhout's wonderful book:
https://www.goodreads.com/book/show/39996759-a-philosophy-of...
which has been discussed here previously:
https://news.ycombinator.com/item?id=42456492
https://news.ycombinator.com/item?id=43166362
The book's title is "A Philosophy of Software Design", for those not wanting to follow the link.
This is one of the articles that influenced me most. I did write two articles trying to tie it into some other books I've read on software design, the first being https://entropicthoughts.com/software-design-tree-and-progra...
I found this article much easier to digest than the paper. Nice one
One of my favorite ever papers and very ahead of its time. Thanks for the share :)
A paper every programmer should read and comprehend. Related to object capabilities/capability-based security: e.g. a network capability is a powerful way to control how a program can access the network. In fact a capability naturally is the interface for a module; the module is accessible only through the capability. Compartmentalizing programs by basic functionality, akin to mechanisms as opposed to policies, removes artificial layering and ensures that a piece of code actually achieves what it sets out to do in a logically minimal fashion. Information hiding (encapsulation) is the essence of both modularity and abstraction, giving library reusability, mockability and lowered state dependence (for testing!), robust security, lower cognitive load, easier rewritability, and more. All in the name of writing programs that achieve what they set out to do.
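One way to picture "a capability is the interface for a module" is the sketch below: code receives an object that grants access to a single host and is given no other route to the network. Everything here (`NetworkCapability`, `fetch_status`, the injected transport function, the host name) is hypothetical. Plain Python also does not enforce the restriction, since any code can still import `socket`; real object-capability systems enforce it at the language or OS level.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class NetworkCapability:
    """Hypothetical capability: permission to talk to one host only."""

    allowed_host: str
    # The transport is injected, so tests can pass a fake instead of a socket.
    _send: Callable[[str, bytes], bytes]

    def request(self, host: str, payload: bytes) -> bytes:
        if host != self.allowed_host:
            raise PermissionError(f"no capability for host {host!r}")
        return self._send(host, payload)


def fetch_status(net: NetworkCapability) -> bytes:
    """Module code: it can reach the network only through `net`."""
    return net.request("status.example", b"GET /status")


def fake_send(host: str, payload: bytes) -> bytes:
    """Stand-in transport for illustration; no real I/O happens."""
    return b"ok from " + host.encode()


cap = NetworkCapability("status.example", fake_send)
print(fetch_status(cap))
```

Passing a fake transport is also the mockability benefit mentioned above: the module can be tested without any network at all.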
So basically functions vs objects that provide methods that operate on opaque data types, except this was written at a time when programming languages that offered modules, such as Modula-2, had yet to appear, although one could then, as now, implement them oneself in C.
To modern ears, the title of the article seems to falsely promise that it's about system design (criteria for decomposing a system into modules) rather than about the benefit of information-hiding modules over collections of functions.
If you are working with state-based behaviour, a module is an object with a set of mutually exclusive states.
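A minimal Python sketch of that view, assuming a hypothetical connection module whose states and transitions are invented for illustration: the object is in exactly one state at a time, and only the module decides which transitions are legal.

```python
from enum import Enum, auto


class State(Enum):
    CLOSED = auto()
    OPEN = auto()
    FAILED = auto()


class Connection:
    """Hypothetical module: always in exactly one of the states above."""

    # Allowed transitions, kept private to the module.
    _TRANSITIONS = {
        State.CLOSED: {State.OPEN, State.FAILED},
        State.OPEN: {State.CLOSED, State.FAILED},
        State.FAILED: {State.CLOSED},
    }

    def __init__(self) -> None:
        self._state = State.CLOSED

    @property
    def state(self) -> State:
        return self._state

    def _move_to(self, new_state: State) -> None:
        if new_state not in self._TRANSITIONS[self._state]:
            raise ValueError(
                f"cannot go from {self._state.name} to {new_state.name}"
            )
        self._state = new_state

    def open(self) -> None:
        self._move_to(State.OPEN)

    def close(self) -> None:
        self._move_to(State.CLOSED)

    def fail(self) -> None:
        self._move_to(State.FAILED)
```

Callers can read `state` but cannot set it directly, so the transition table remains a hidden design decision.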
I noticed many programmers don't care about modularity. Some people can handle complexity better than others and don't see the need for it.
Take that with some skepticism. Everyone can handle the complexity that they wrote yesterday. Of those, many will want to avoid extra work on behalf of others or their future selves.