Zero Comments: A Way To Write Better Code?

Searching for the ideal level of commentary in program source code ...

How many comments is the right amount of comments to include in your code?

In a previous article I suggested an extreme answer: every line of code should be preceded by a comment which explains something about that line of code.

In this article, I am going to describe an opposite extreme: don't write any comments at all in your source code.

Why Zero Comments Can Be Good For Your Code

The motivation for writing code without any comments is this: comments are something that you write because you failed to make the code as self-descriptive as it could be.

What often happens when you write code is: you write some code, and you make a half-hearted effort to choose good names and to structure the code nicely. But, then, despite your half-hearted efforts, you realise that the casual reader of your code might have some difficulty understanding everything that is happening in your code. So you add some comments to your code, which explain in English (or in some other non-programming language) whatever it is that needs to be explained.

But maybe, if you had tried harder, you could have chosen better names, and a better structure, in the source code, without adding comments.

So this is my proposal: don't write any comments, and force yourself to make the code itself be as readable and as self-explanatory as it possibly can be.
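
To make the proposal concrete, here is a minimal sketch, assuming Python. Every name in it is invented for illustration, and it is not taken from the code examples discussed later in this article. The first function relies on comments to explain terse names. The second does the same work, but the explanation has moved into the names and the structure, so the comments are no longer needed.

```python
# Illustrative sketch only; all names below are hypothetical.

# Version 1: terse names, propped up by comments.
def sieve(n):
    # a[i] stays True while i might still be prime
    a = [True] * (n + 1)
    # 0 and 1 are not prime
    a[:2] = [False] * min(2, n + 1)
    i = 2
    # only need to check up to the square root of n
    while i * i <= n:
        if a[i]:
            # cross out the multiples of i
            for j in range(i * i, n + 1, i):
                a[j] = False
        i += 1
    return [k for k in range(n + 1) if a[k]]


# Version 2: the same logic, with the explanation moved into the names.
def primes_up_to(limit):
    is_candidate = [True] * (limit + 1)
    zero_and_one = range(0, min(2, len(is_candidate)))
    mark_as_not_prime(is_candidate, zero_and_one)
    for number in range(2, limit + 1):
        if number * number > limit:
            break
        if is_candidate[number]:
            multiples = range(number * number, limit + 1, number)
            mark_as_not_prime(is_candidate, multiples)
    return [
        number
        for number, still_candidate in enumerate(is_candidate)
        if still_candidate
    ]


def mark_as_not_prime(is_candidate, numbers):
    for number in numbers:
        is_candidate[number] = False
```

Whether the second version is really easier to read is a judgment call, but writing it forces every decision that the comments in the first version were papering over.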

Three Possible Objections to Writing Code With Zero Comments

I can think of three objections that might be raised to this approach, and I will deal with each one in turn:

Background knowledge: it may be very difficult for anyone reading the code to understand it if they lack certain background information. One of the things that comments are for is to supply this missing background information for those readers who don't have it.
API documentation generation: javadoc, and its relatives such as pydoc (for Python) and rdoc (for Ruby), need at least one descriptive comment (or docstring) for each class and each method in order to produce useful output.
Legal requirements: the code may be derived from other code with an open source license which requires attribution and duplication of the license in any derived code.

Background Knowledge

The two code examples that I have written so far to demonstrate the principle of zero-comment coding are Sieve of Eratosthenes and Caesar's Cipher. These are "Hello World"-type problems that are commonly used as examples, for instance to show how to write programs in a particular programming language.

They make good examples, because many people with a technical background would already know something about them. (Also, they are a little bit complicated, but not too complicated.)

But if you had never heard of the Sieve of Eratosthenes, and you had no idea what it was for, then reading the source code for an implementation of it with zero comments might be a bit of a struggle.

My proposed solution to this difficulty is to separate the requirement for background knowledge from any decision made about how many comments should be included in source code, by providing background knowledge in a separate document, which, for want of a better name, can be called a README. For example, see the README for the Sieve of Eratosthenes code example.

Linearity

One advantage of putting background information in a separate document is that you can write it up with a linear order and structure which is natural for a human reader. There is no need to structure the information according to the structure of the code, and you can avoid what might be called "very very long comment at the beginning of the source file" syndrome (which is what happens when programmers realise they need to write a long linear stretch of prose giving background information about the problem that a class, function or method is designed to solve).

Code Notes

Even after you have written up the required background information for your code project in a separate document, there might still be specific things that you want to document in the source code, perhaps because something tricky is happening, or because there is some special optimisation.

To maintain a strict "zero comments" policy, I propose that even this sort of information should be added into the background document, in a special section called Code Notes. Both of the code examples given above contain Code Notes sections in their background README files.

API documentation

If you don't have any comments in your source code, then when it comes time to generate API documentation, there won't be any descriptions of classes and methods other than the names of those classes and methods, which, even if you chose your names really well, might not be quite good enough.

To get the full benefit of the discipline of writing "zero comments" code, while still satisfying the requirements for API documentation, I suggest the following process: write zero-commented code first, get it perfect, and then add in class and method comments required for API documentation.
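
As a hedged illustration of this two-step process, assuming Python and pydoc (where the API descriptions are docstrings rather than comments), the sketch below shows what a small Caesar's Cipher module might look like after the second step. The function names, parameters and wording are hypothetical and are not taken from my GitHub examples. Deleting the docstrings gives back the zero-comments version.

```python
"""Caesar's Cipher helpers (illustrative sketch; all names are hypothetical)."""

import string

ALPHABET = string.ascii_lowercase


def shift_letter(letter, shift):
    """Return the lowercase letter `shift` places after `letter`, wrapping at 'z'."""
    shifted_position = (ALPHABET.index(letter) + shift) % len(ALPHABET)
    return ALPHABET[shifted_position]


def encrypt(plain_text, shift):
    """Return `plain_text` with each lowercase ASCII letter shifted by `shift`.

    All other characters are copied through unchanged.
    """
    return "".join(
        shift_letter(character, shift) if character in ALPHABET else character
        for character in plain_text
    )


def decrypt(cipher_text, shift):
    """Return the text that `encrypt` turns into `cipher_text` for the same `shift`."""
    return encrypt(cipher_text, -shift)
```

The docstrings here only restate the contract of each function for the generated documentation. The body of each function is unchanged from what the zero-comments version would contain.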

And to allow for the possibility that further code development will happen (even if you wrote the code to be "perfect"), maintain the commented and uncommented code in separate branches in your repository.

For example, in my zero-comments examples on GitHub, I have a master branch, without any comments (other than an attribution comment), and a commented branch, with API documentation comments.

(An alternative is a zero-comments branch for the zero-comments version of the code, and a master branch containing code with API documentation comments included. This might be a preferable branching scheme, given that one would want to consider the master branch to be the official "release" branch.)

Legal requirements

If there are certain comments that have to be in your code, for legal reasons, then probably you have to put them in, even in the "zero comments" version of your code, if that code is published publicly.

Neither of my code examples so far has an explicit open source license requirement, but the Caesar's Cipher code is loosely based on an example I found on the internet (after Googling "extremely descriptive code"), so I have added a comment, in both the commented branch and the "zero comments" master branch, giving a URL for attribution.

And a link ...

Just before I posted this article to proggit, I saw this link: "Without comment", which happens to be on a very similar theme. (If there's a difference, it's that I'm saying that yes, there are good reasons why there do have to be some comments in the code, but maybe there are benefits to having a version of the source code that doesn't even have those comments in it.)
