The myth about code comments

It's still a very common belief that code comments make code readable and clean.

June 27, 2023

Last updated on June 27, 2023 20:53:14 UTC

Every time I join a new software engineering team, I eventually have to debunk this common myth about code comments: that they are a good way to add context and understanding to your program. Years ago, a few colleagues and I were studying some well-known software engineering curriculum, and we all came to the conclusion that code comments are not the best way to add context to your program. Since then, pretty much every group of software engineers I've worked with has reacted to this idea with something between skepticism and outright rejection. The basic premise is this: the only thing that can be truthful is the code. Comments are not reliable and never have been. So why do so many developers think code comments are still a good idea?

For a long time, computer science programs taught that code comments are important for communicating the intent of your program. A lot of people I have worked with who have computer science degrees had this practice ingrained in them, because they would get worse grades if there were no comments. In some languages you absolutely need to write comments because the code can't be understood without them. There is a recent swing back in these programs toward teaching clean code best practices formally. The University of Illinois here in my hometown has a class in its computer science program whose intent is to teach clean code. This is good, but many programs are still way behind in this area.

The first time I was exposed to this idea was through Uncle Bob (Robert C. Martin). But since then I've seen others advocate for it too. The reality is that when you really need to know what a program does, you read the code. You can't trust external information -- docs, READMEs, diagrams, Confluence pages, and especially comments. We have all read a README that was out of date and left out some key setup step, so we had to read the code line by line to find the issue. We have also all read comments that have rotted and contain misleading information. Code always does what it says. Comments, on the other hand, can lie for a number of reasons. Comments are often written once and never updated. They can contain non-local information, meaning information about another function in another file. A developer could change what goes on in the referenced file, and that comment will then contain misinformation. And there would be no way for that developer to know to update that comment. The tools we have today, like IntelliSense and static analysis, have no way to validate the accuracy of a code comment.
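As a minimal sketch of how a non-local comment goes stale, consider the hypothetical example below. The names `MAX_LOGIN_ATTEMPTS` and `is_account_locked` are invented for illustration, and the constant stands in for a setting that would normally live in another file.

```python
# Hypothetical setting; imagine it is defined in another file (say, config.py).
# It was 3 when the comment inside is_account_locked was written.
MAX_LOGIN_ATTEMPTS = 5


def is_account_locked(failed_attempts: int) -> bool:
    # Lock the account after 3 failed attempts (see config.py).
    # ^ Stale: the limit was later raised to 5 elsewhere, and no tool
    #   flagged that this comment no longer matches the behavior.
    return failed_attempts >= MAX_LOGIN_ATTEMPTS
```

The code still does exactly what it says: three failed attempts no longer lock the account. Only the comment claims otherwise.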

And those are only the cases where a comment is actively harmful. Comments would be worth avoiding for that reason alone, but most of the time they are simply noise. They restate the code and add nothing of value. There's a reason why so many editor themes render comments in a muted gray.

Figure: a meaningless code comment that restates the function name.
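The original image is not reproduced here. A hypothetical sketch of the kind of comment the caption describes might look like this (the function and its parameters are made up for illustration):

```python
# Calculates the total price.
# ^ This line only restates the function name; deleting it loses nothing.
def calculate_total_price(unit_price: float, quantity: int) -> float:
    return unit_price * quantity
```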

There's that really good quote by Grady Booch: "Clean code reads like well-written prose." We can write code that is expressive and understandable with only the code itself. And that's really powerful, because when we really want to know how something works, the code is all we can trust. Sometimes developers can be apprehensive about giving variables and functions long names. I used to write everything in Vim, and yeah, I hated long function names. But we live in an age of powerful IDEs with indexing and refactoring tools; long names are not a burden anymore. I understand that a lot of developers see a long name and it feels "wrong". I feel the same twinge sometimes, because it feels off, unfamiliar. But I have gotten used to it now, especially after all the times I have been thankful to read a long, descriptive function name that gave me all the context I needed.
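To make the point about descriptive names concrete, here is a small before-and-after sketch. Everything in it (the `User` class, its fields, and both function names) is hypothetical and only meant to illustrate the idea, not to prescribe a naming scheme.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class User:
    is_active: bool
    trial_ends_on: date


# Before: terse names, so a comment has to carry the meaning.
def check(u: User, d: date) -> bool:
    # user is active and the trial has not expired yet
    return u.is_active and d <= u.trial_ends_on


# After: the names carry the meaning, so there is nothing left to comment.
def is_active_user_within_trial_period(user: User, today: date) -> bool:
    trial_has_not_expired = today <= user.trial_ends_on
    return user.is_active and trial_has_not_expired
```

Both functions behave the same way. The second one is longer to type, but a call site such as `is_active_user_within_trial_period(user, today)` reads without needing to open the function body.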

And I'm not saying that I never write comments. I definitely write comments. It's that when I do, I know I could have done better. I wasn't able to communicate well using only the code. I will try to refactor the code so it is readable on its own. And sometimes you can't quite make it. There are always exceptions, like a cryptic regular expression that you'll just want to leave a comment on because the code itself isn't expressive. The policy I advocate for is never "nuke all comments." It's more about achieving the goal of "well-written prose". There are also plenty of cases where comments are simply mandatory, like documenting the public API of a package. Sometimes you need to include an author name and copyright. Nothing you can do about that.
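The regular expression case could look something like the sketch below. The pattern, the constant name, and the helper function are all assumptions made up for this example; the point is only that the comment explains something the pattern itself cannot say clearly.

```python
import re

# Shape of an ISO 8601 calendar date such as 2023-06-27:
# four-digit year, month 01-12, day 01-31.
# It does NOT check that the day exists in that month (2023-02-31 passes).
ISO_DATE_PATTERN = re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")


def looks_like_iso_date(text: str) -> bool:
    # fullmatch requires the whole string to match, so no ^ or $ is needed.
    return ISO_DATE_PATTERN.fullmatch(text) is not None
```

Even here, a well-chosen name like `looks_like_iso_date` does part of the work; the comment is left to cover the limitation that the name and the pattern cannot express.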

With those concessions aside, comments are the last, most desperate measure to add context to a program. This is usually the hardest point to get across. Developers I've worked with will usually come around, but they don't want to give up comments completely, believing that they can still be useful in many cases. My challenge is to do as much as you can with your code without them. You might be surprised what you can do with some git savvy, an IDE (instead of VS Code), and some good names, interfaces, and classes. That result is always going to be cleaner and more expressive than any comment you could have written. And so my point remains: comments are not the best way to add context to a program; code is.