One of my earlier functions was a true abomination.
It was 250 lines long, accepted 18 input parameters, and was nested up to 9 levels deep with highly complex conditions at all levels. It’s not only gross; it’s horrible, and I wish it would just go away. But it does its job, and I would need to put significant effort into restructuring the whole program if I wanted to do something about it. And you can bet I’m not touching it now, not with a ten-foot pole.
The worst part? It’s the central cog in an equally abominable class (written entirely by yours truly) producing these plots, which are still used during scientific rocket launches from Andøya to determine the optimal moment for pressing the big red button.
As I look at it now and shudder, it is clear to me that being able to code doesn’t make you a software engineer. Not that I would consider myself one yet. But even experienced software engineers aren’t perfect. Everyone who’s done a bit of programming has stumbled upon bad code. Variable names are unclear, hiding their intention instead of revealing it. Function names leave you surprised when you discover what the functions do. Methods may be long, do many things at once and perform operations across several layers of abstraction.
Clean Code: A Handbook of Agile Software Craftsmanship is a book by Robert C. Martin (aka “Uncle Bob”) which recognizes that unclear code is in fact an industry-wide problem with real downsides both to programmers and to the business side of things. The book sets out to provide a set of fundamental principles and values which can aid us in creating the digital work of art that clean code certainly is.
I’ve benefited greatly from reading this book. In fact, without it I might not have been offered the position I recently accepted. While I’ve done a lot of coding during my PhD, even some proper development of Python packages all the way from feature design to deployment from continuous integration testing servers, my focus has been on the scientific side of things. I’ve certainly chosen some unclear variable names and written my fair share of functions that should have been split into at least ten smaller ones.
As my PhD progressed, I discovered that programming was my real passion and that software development was what I really wanted to do for a living. I knew then I had to improve my coding practices. Clean Code was the first pure programming book I read. Its main target group is professional developers, but everyone who writes code regularly (and can read Java, which is the language of the code examples in the book) can benefit from it. You shouldn’t blindly follow everything the book says – the contents are explicitly stated at the start of the book to be personal preferences – but it does contain a solid amount of sound advice and valuable guidelines.
The cost of owning a mess
The author starts by calling attention to the often underappreciated cost of owning a mess. A sad and allegedly common story goes like this: With messy code, you have to spend time understanding the intricacies of the code so that you can add more layers of intricacies. Every change you make breaks the code in three other places. Productivity plummets. Management throws more developers at the project, which is as effective as extinguishing a fire with gasoline. The team rebels, and the best are selected for a complete redesign. Years pass, the redesign team is gradually replaced, and the new developers require another redesign because the first redesign is now yet another mess. Meanwhile, the original project is still trudging along, sinking ever deeper into a molasses of bad code and time-consuming maintenance, because it can’t be replaced before the new version can do everything the old one can.
This scenario is avoidable. Sound principles of how to structure and write classes and methods, when combined with liberal refactoring (made safe by all the unit tests you’re already writing, right?), make it possible to keep the code clean. Code can almost always be improved, and if you follow the boy scout rule of always leaving the campsite a bit tidier than you found it, you’ve come a long way.
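To make the "refactoring made safe by tests" idea concrete, here is a minimal sketch in Java (the language of the book's examples). Everything in it is hypothetical and invented for illustration: `calcBefore` stands in for a terse helper as you might find it, `weeksNeededToCover` is the tidied version, and the check simply confirms that both agree over a range of inputs before the old one is thrown away.

```java
// Hypothetical example: none of these names come from the book or from my own code.
final class BoyScoutRefactoring {
    static final int DAYS_PER_WEEK = 7;

    // Before: works, but the names say nothing about what is being computed.
    static int calcBefore(int d) {
        int r = 0;
        if (d > 0) {
            r = d / 7;
            if (d % 7 != 0) {
                r = r + 1;
            }
        }
        return r;
    }

    // After: same behavior, with names that reveal the intent.
    static int weeksNeededToCover(int days) {
        if (days <= 0) {
            return 0;
        }
        int fullWeeks = days / DAYS_PER_WEEK;
        boolean hasPartialWeek = days % DAYS_PER_WEEK != 0;
        return hasPartialWeek ? fullWeeks + 1 : fullWeeks;
    }

    // The safety net: the tidied version must agree with the original everywhere we check.
    static void checkRefactoringPreservesBehavior() {
        for (int days = -3; days <= 30; days++) {
            if (calcBefore(days) != weeksNeededToCover(days)) {
                throw new AssertionError("Behavior changed for days = " + days);
            }
        }
    }

    public static void main(String[] args) {
        checkRefactoringPreservesBehavior();
        System.out.println("Refactoring preserved behavior.");
    }
}
```

In a real project the check would live in a unit test framework rather than in `main`, but the principle is the same: pin down the behavior first, then clean up.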
Clean Code basically does two things: it hammers into the reader’s mind that you should always leave the code a bit cleaner than you found it, and it presents a set of principles detailing what “clean” actually is and how you can achieve it.
So what is clean code?
That’s what Uncle Bob asked several big shots in the software industry, from Bjarne Stroustrup to Ward Cunningham.
Clean code should make it hard for bugs to hide. Clean code does one thing well. It’s elegant, simple, and direct, and reads like well-written prose. It looks like it was written by someone who cares. There is nothing obvious you can do to make it better. Clean code runs all the tests. (You have tests, right? Of course you have.) And code is clean when every routine does pretty much what you expected.
Those are certainly some inspiring takes on what clean code is, but they only tell you how to recognize it, not how to write it. That’s what the rest of the book is about. The way it goes about this is neatly summarized thus:
Just like a book on art, this book will be full of details. There will be lots of code. You’ll see good code and you’ll see bad code. You’ll see bad code transformed into good code. You’ll see lists of heuristics, disciplines, and techniques. You’ll see example after example. After that, it’s up to you.
The book then proceeds to detail the principles of clean code. I’d like to highlight the two aspects I have profited the most from myself: Meaningful names, and how to structure functions and methods.
Meaningful names
Often, the code in itself is not very expressive. Names of variables and functions are one of the primary ways we can communicate the intention of the code. If a name requires a comment, it does not reveal its intent. Sure, the two declarations below are understandable in the declaration context, but which one would you rather see in the rest of the method or class?
    int d; // elapsed time in days
    int elapsedTimeInDays;
Using the latter avoids what Uncle Bob refers to as “mental mapping”: it doesn’t require you to remember what d was supposed to be.
As an example of names that do little to reveal their intention, consider this function:
    public bool won()
    {
        return Math.Abs(nWon(pla) - nWon(plb)) > 5 - g.num;
    }
Do you see immediately what it does? What is the significance of 5? What is g.num? What are pla and plb? You can probably figure it out, but why waste your (and other people’s) time making the code hard to understand?
In contrast, this should be much clearer:
    public bool isMatchWonWhenBestOfFiveGames()
    {
        int numberOfGamesFinished = games.Count;
        int numberOfGamesRemaining = 5 - numberOfGamesFinished;
        int lead = Math.Abs(NumberOfGamesWon(Player.A) - NumberOfGamesWon(Player.B));
        bool losingPlayerCannotCatchUp = lead > numberOfGamesRemaining;
        return losingPlayerCannotCatchUp;
    }
I’m not saying it’s perfect, but it’s certainly better. When the time comes to make some updates to your code, everyone will thank you. Sure, the former function is shorter, but brevity is never an excuse for obscurity.
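If you want to run the idea yourself, here is a minimal self-contained sketch in Java (the language of the book's examples, rather than the C# above). The `BestOfFive` class, the `Player` enum, and the list of game winners are assumptions made for illustration; the post's own match class is not shown. The sketch also gives the magic number 5 a name, which is one more step in the same direction.

```java
import java.util.List;

// Hypothetical stand-ins: the post never shows its own Match or Player types.
final class BestOfFive {
    enum Player { A, B }

    // Naming the constant answers "what is the significance of 5?" in one place.
    static final int MAX_GAMES_IN_MATCH = 5;

    // gameWinners holds the winner of each finished game, in playing order.
    static boolean isMatchWon(List<Player> gameWinners) {
        int numberOfGamesRemaining = MAX_GAMES_IN_MATCH - gameWinners.size();
        int lead = Math.abs(numberOfGamesWon(gameWinners, Player.A)
                - numberOfGamesWon(gameWinners, Player.B));
        boolean losingPlayerCannotCatchUp = lead > numberOfGamesRemaining;
        return losingPlayerCannotCatchUp;
    }

    static int numberOfGamesWon(List<Player> gameWinners, Player player) {
        int gamesWon = 0;
        for (Player winner : gameWinners) {
            if (winner == player) {
                gamesWon++;
            }
        }
        return gamesWon;
    }

    public static void main(String[] args) {
        // 2-1 after three games: the trailing player can still catch up.
        System.out.println(isMatchWon(List.of(Player.A, Player.A, Player.B))); // false
        // 3-0 after three games: two remaining games cannot close a lead of three.
        System.out.println(isMatchWon(List.of(Player.A, Player.A, Player.A))); // true
    }
}
```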
The book, of course, goes much deeper than this. I’ll leave it to you to read it for yourself.
Functions
Uncle Bob’s three most important rules for functions are these: One, they should be small. Two, they should be smaller than that. Three, they should do one thing, and one thing only. The “one thing” maxim includes sticking to one level of abstraction inside the function (which is far easier if the function is small). And of course, following the previous chapter on names, everything should be so clearly advertised in the function name that after reading it, you should never be surprised when you read the function body.
Consider this function for ending a match:
    private void EndMatch()
    {
        // update GUI
        scoreButtonA.Enabled = false;
        scoreButtonB.Enabled = false;
        newMatchButton.Visible = true;
        newMatchButton.Focus();

        // get winning player name
        Player winningPlayer = match.NumberOfGamesWon(Player.A) > match.NumberOfGamesWon(Player.B)
            ? Player.A
            : Player.B;
        string winningPlayerName = winningPlayer == Player.A
            ? textBoxPlayerA.Text
            : textBoxPlayerB.Text;

        // show which player has won
        MessageBox.Show(winningPlayerName + " won!");
    }
A very helpful thing you can do is to put the word “to” before the function name, and then explain all the steps. Here, it’s quite a mess:
- To end the match, disable score button A, then disable score button B, then show the “new match” button, then focus the “new match” button, then get the winning player, then get the winning player name from the correct text box, then show a message box with a message telling which player has won.
This mixes several layers of abstraction. It becomes more clear if you try to think about this with a fresh mind. What, at the highest level, are you really doing when you’re ending a match? How about this:
- To end the match, disable the score buttons, then show and focus the new match button, then notify which player has won.
But how do you disable the score buttons, and so on? Let’s answer that too, as high-level as we can:
- To disable the score buttons, disable score button A, then disable score button B.
- To show and focus the new match button, show the new match button, then focus it.
- To notify which player has won, get the winning player name, then show it in a message box.
- To get the winning player name, get the winning player,[1] then get the player name from the correct text box.
And lo and behold, the list above shows how our functions should be:
    private void EndMatch()
    {
        DisableScoreButtons();
        ShowAndFocusNewMatchButton();
        NotifyWinningPlayer();
    }

    private void DisableScoreButtons()
    {
        scoreButtonA.Enabled = false;
        scoreButtonB.Enabled = false;
    }

    private void ShowAndFocusNewMatchButton()
    {
        newMatchButton.Visible = true;
        newMatchButton.Focus();
    }

    private void NotifyWinningPlayer()
    {
        string winningPlayerName = GetWinningPlayerName();
        MessageBox.Show(winningPlayerName + " won!");
    }

    private string GetWinningPlayerName()
    {
        Player winningPlayer = match.GetPlayerWhoWon();
        return winningPlayer == Player.A ? textBoxPlayerA.Text : textBoxPlayerB.Text;
    }
That’s a bit clearer than the first example. We can dial it both back and forth if we want, but this is sufficient to get the point across.
If you find yourself sectioning off a function with comments to tell what the different sections do, as in the first example above, then you’re usually doing more than one thing or mixing layers of abstraction (or both). Writing several smaller functions that each do their own simple thing makes the code easier to maintain, because each function has a clear responsibility and only one reason to change.
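To try the decomposition outside a GUI, here is a minimal sketch in Java (again the book's language, not the C# above). The `RecordingUi` class is a hypothetical stand-in for the buttons, text boxes, and message box: it only records what it was asked to do. The player names and the winner passed to the constructor are likewise made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class EndMatchSketch {
    enum Player { A, B }

    // Hypothetical stand-in for the GUI: it records each request instead of drawing anything.
    static final class RecordingUi {
        final List<String> actions = new ArrayList<>();

        void disableScoreButton(Player player) { actions.add("disable score button " + player); }
        void showNewMatchButton() { actions.add("show new match button"); }
        void focusNewMatchButton() { actions.add("focus new match button"); }
        void showMessage(String message) { actions.add("message: " + message); }

        // Made-up names; in the post these come from two text boxes.
        String playerName(Player player) { return player == Player.A ? "Alice" : "Bob"; }
    }

    private final RecordingUi ui;
    private final Player winningPlayer; // in the post, the match object supplies this

    EndMatchSketch(RecordingUi ui, Player winningPlayer) {
        this.ui = ui;
        this.winningPlayer = winningPlayer;
    }

    // Top level: reads like the "to end the match" sentence.
    void endMatch() {
        disableScoreButtons();
        showAndFocusNewMatchButton();
        notifyWinningPlayer();
    }

    // One level down: each helper does a single thing.
    private void disableScoreButtons() {
        ui.disableScoreButton(Player.A);
        ui.disableScoreButton(Player.B);
    }

    private void showAndFocusNewMatchButton() {
        ui.showNewMatchButton();
        ui.focusNewMatchButton();
    }

    private void notifyWinningPlayer() {
        ui.showMessage(winningPlayerName() + " won!");
    }

    private String winningPlayerName() {
        return ui.playerName(winningPlayer);
    }

    public static void main(String[] args) {
        RecordingUi ui = new RecordingUi();
        new EndMatchSketch(ui, Player.B).endMatch();
        ui.actions.forEach(System.out::println);
    }
}
```

Running it prints the five recorded steps in order, ending with the "Bob won!" message. Each helper can now change for its own reason without touching the others.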
Of course, it’s not as simple as that in practice. Another principle of Clean Code is to keep your classes as small as possible. Keeping classes small conflicts with the proliferation of methods that you get when splitting up larger functions into smaller ones. In the end, it’s all about balance.
Keep your brain running and you’ll benefit from it
There are loads of helpful principles and examples in the rest of the book too. Uncle Bob has a lot to say about comments (as few as possible, since the code and the names in most cases should be self-explanatory), formatting (which mostly doesn’t matter – as long as you’re consistent with yourself and the rest of the team, you can place your parentheses wherever you want), objects and data structures, error handling, code boundaries and interfaces, unit tests, et cetera.
No book is perfect, though. There’s a chapter on concurrency which feels a bit out of place, and of course, no principles should be followed blindly. Use common sense – the book doesn’t present hard truths, merely opinions from a highly experienced software developer. Granted, they’re quite sound opinions, but the pragmatist in me sees that if the principles presented in the book are to be taken to their extreme in all situations, the code might suffer. For example, the “boy scout” rule of always leaving the code a bit cleaner than you found it can result in refactoring for the sake of refactoring, which might not always lead to an improvement.
This goes for anything, really – don’t shut off the critical part of your brain.
Still, I don’t doubt that more or less every software developer in the world could gain at least something from reading this book, and I heartily recommend it. If everyone read it, the world of code would certainly be a better place. Better software would be written, time and money would be saved, and Peter Welch wouldn’t have to write Programming Sucks. Which is a brilliant piece you should go read right now.
Cover image from sterlingsheehy.com.
[1] We were clever and added a method for doing this to the class representing a match.
Interesting stuff! I wish there were similar books for me to read in the field of chemistry. :-P
I wished for that in physics, too – so I switched to programming. ;)