How to Write Clean Code Functions

I share my thoughts on the different sections of the book Clean Code by Robert Martin. In this article, I talk about what I learned about functions.

Messy Function Code

I wanted to start off similarly to how Martin started his chapter on functions with a poorly designed function followed by the refactored version. See if you can make sense of this code in three minutes or less.

Press to Show Messy Code

Refactored Code

It’s not super complicated once you figure out what it’s doing but it does have a lot going on. The function is attempting to color specific field name cells in a grid-based off of what the value is from the gridview and the User’s color qualifications from the database. Below is a refactored version of what you see above. I took out the lower level abstraction functions.

Press to Show Refactored Code

Small!

The first rule of functions: Keep it small.

The second rule of functions: It should be smaller than that.

Functions should hardly be 20 lines long. This keeps it simple and transparently obvious.

There’s no real research or a reference that can be pointed to when making these assertions but through the experience of other developers, it has been said functions should be small.

Blocks and Indenting

If, else, & while statements should be 1 line long and probably be a function call.

It adds documentary value because the function called within the blocks can have a nicely descriptive name.
Functions should also not be large enough to hold nested structures. The indent level of a function should not be greater than one or two.

Do One Thing

FUNCTION SHOULD DO ONE THING. THEY SHOULD DO IT WELL. THEY SHOULD DO IT ONLY.

The problem with this statement is that it is hard to know what “one thing” is. The way to do it is describing the function in a brief TO paragraph:

TO SetOnTimeDeliveryCustomCellStyle, we check to see whether the fieldName is OnTimeDelivery and if so, we GetMetricDataToCheck from database…

If a function does only those steps that are one level below the stated name of the function, then the function is doing one thing. The reason we write functions is to decompose a larger concept (the name of the function) into a set of steps at the next level of abstraction.

Another way to know a function is doing more than “one thing” is if you can extract another function from it with a name that is not merely a restatement of its implementation.

Stepdown Rule

We want the code to read like a top-down narrative. Every function to be followed by those at the next level of abstraction so that we can read the program, descending level of abstraction at a time as we read down the list of functions.

Switch Statements

Switch statements are hard to make small. Even if it’s only using 2 statements, it’s longer than it should be. It’s also hard to make the switch do one thing. However, switch statements are not always avoidable so instead, what Martin suggests is to bury them in a low-level class through the process of polymorphism. This concept took me a little bit to figure out, so let’s see if we can explain this well.

There are a couple of problems with this function. The first is that it is large and will continue to grow as more letters are added. Secondly, it does more than one thing and we know that a function should have one responsibility. The worst one is that there will be an unlimited number of functions with the same structure. For example, each function might have:

createSubjectLine(Notification n, string message)

sendLetter(Notification n, Email emailAddress)

The solution that Martin comes up with for the switch statements is to bury the switch statement in the basement of an ABSTRACT FACTORY. The factory will use the switch statement to create appropriate instances of the derivatives of Notification and the various functions shall be dispatched polymorphically through the Notification interface.

The general rule for switch statements, according to Martin, is that they can be tolerated if they appear once, are used to create polymorphic objects, and are hidden behind an inheritance relationship so the rest of the system can’t see them.

Use Descriptive Names

It is hard to overestimate the value of good names. A quote by Ward Cunningham, inventor of Wiki: “You know you are working on clean code when each routine turns out to be pretty much what you expected”
Half the battle is choosing good names for small functions that do one thing. The smaller and more focused the function is, the easier it is to choose a descriptive name.
Don’t be afraid to make a name long. A long descriptive name is better than a short mysterious/difficult name.
Be consistent with the naming phrases, nouns, and verbs.
- Include…., include…,

Function Arguments

The ideal number of arguments for a function is zero (niladic). Next comes one (monadic), followed closely by two (dyadic). Three arguments (triadic) should be avoided where possible. More than three requires special justification and even then shouldn’t be used.

Arguments are difficult. They take a lot of conceptual power. Readers will need to interpret it each time they see an argument. Whatever the argument is, it forces you to know details on it that might not be needed.

From a testing perspective, the more arguments, the more things you need to test for.

Output arguments should not be a thing. Functions are expected to have IN arguments and return OUT a value.

Flag Arguments

Flag arguments are ugly. It loudly proclaims that the function does more than one thing. Instead, split the function into 2 functions with renderForTrue() and renderForFalse(). (<–For example)

Dyadic Functions

A function with 2 arguments is harder to understand than a monadic function. For example: writeField(name) is easier to understand than writeField(output-Stream, name)

Though the meaning is clear for both, the first easily glide past your eyes, the second requires a short pause until you learn to ignore the first parameter… and that would eventually result in problems since we should never ignore code.

Triads

Functions that take three arguments are significantly harder to understand than dyads. The issues of ordering, pausing, and ignoring are more than doubled.
For example:
- assertEquals(message, expected, actual). How many times would you read message and thought it was expected?
- On the other hand, here is one that is not so bad: asserEquals(1.0, amount, .001). It still requires a double-take but it is worth taking and being reminded of equality of floating-point values.

Reduce the # of Arguments by Making Objects of Them

Consider the following:

Circle makeCircle (double x, double y, double radius);
Circle makeCircle (Point center, double radius);

Notice the second function call becomes a dyad versus a triad function.

Use Exceptions

Instead of throwing error messages, use exceptions.
Doing the try-catch statement is considered one thing, so have one function that is used to catch exceptions and the function that does the function inside it.

Have No Side Effects

Make sure the function only does one thing. Here’s another example.
checkPassword(String userName, String password)
- Since I don’t want to write out the whole function, take my word for it. Inside this function, there is a function call of “Session.Initialize()”
- This is the side effect of this function. A user could call it and not realize they would lose session data if called at the wrong time.
- One simple solution would be to call it checkPasswordAndInitializeSession, though this violates the “Do one thing”

Command Query Separation

Functions should either do something or answer something, but not both.
Either change the state of an object or return some information on the object.

Doing both often leads to confusion. Consider the following;

public boolean set(string attribute, string value);

This function sets the value of a named attribute and returns true or false depending on the success and if the attribute exists. This leads to this kind of strange code:

if (set("fullName", "Joe Smith"))...

Imagine from the point of the reader. What does this mean? It could mean they are checking that the attribute was assigned the correct value or it could be checking if the attribute exists or if the attribute was set to the value previously. It is unclear to the reader and it is hard to infer from the call.

Martin explains the best way to resolve this problem with this code:

if (attributeExists("fullName"){
     setAttribute("fullName", "Joe Smith");
....
}

Conclusion

When I write functions, they do not come out as clean-cut the first pass. After the atrocious code is written, I go back through and start refactoring to the specifications above the best I can. All of these ideas and rules take time to implement and practice. Remember, the functions are the verbs of the programming language with classes as the nouns. The goal of the programmer is to use their code to tell the story of the system, so the functions need to be written to fit cleanly together into a clear language to help with that storytelling.