What does Clean Code meant to you?

The very basic question

It is the very basic question for middle-level programmer (professional programmer with advanced skill but not yet a master). It has already been discussed maybe for decades in several discussion forums. Some of the source I had found is:
However we arrived to the basic question, what is a clean code actually? This is purely my opinion about clean code.

The general aspects of clean code

To be defined as a clean code, I think that the code should has some of the aspects below:
  1. Readable / high readability
  2. High Cohesion
  3. Low Coupling
  4. Avoid Duplication as Possible
  5. Testable

Readable / High Readability

You write code to be read in the future, either by yourself or other developer. Any code cannot be maintained by anyone if it cannot be read. Any design is useless if the code not readable. That is why the first and most important part in clean code.
Now the question is, what makes the code readable? A code is readable if you can understand what it is doing in the first read. Compare both code below:
public DateTime A(int a){
    int b = B();
    return a - b;
}
With this one:
public DateTime GetWorkingTime(int hourOut){
    int hourIn = GetHourIn();
    return hourOut - hourIn;
}
From the comparison, you can see clearly what the second piece of code is doing in first read instead of the first one. That readability alone already save you much time to define what is the code doing.

High Cohesion

The wiki definition is: 
if the methods that serve the given class tend to be similar in many aspects, then the class is said to have high cohesion. In a highly cohesive system, code readability and the likelihood of reuse is increased, while complexity is kept manageable. 
That said, a module (a method in procedural or class in OOP), the operations that is executed inside must be similar (doing one thing and one thing only). There is one principle in OOP which goal is almost the same, that is Single Responsibility Principle, which said:
every class should have a single responsibility, and that responsibility should be entirely encapsulated by the class. All its services should be narrowly aligned with that responsibility.
So taken an example of the low cohesion code below:
public class SaleOrderSubmitter{
    public void Submit(SaleOrder sale){
        if(!sale.Items.Any(k=>k.Quantity > 0)){
            throw new InvalidOperationException("No Quantity Ordered");
        }
        using(SqlConnection con = new SqlConnection()){
            // sql code to get whether list of item is exists
            bool isAnyItemNotExists = bool.parse(dataRow["Result"].ToString());
            if(!isAnyItemNotExists) { throw new InvalidOperationException("Item does not exists."); }
            // insert item to database
        }
    }
}
The operation above is having 4 responsibility. First is checking whether there are quantity ordered, second is opening connection to database, third is validating whether there are any items not exist in database and fourth is inserting the item to database. Why it is becoming a problem? That is because each responsibility logic cannot be reused (quantity validation, open connection logic, item exists validation, insert to database logic).
Say, if one day you have a requirement for modification, that the order are able to be submitted without item exists validation, you will have a hard time to develop it. Worst, you will have duplicated code which is a bad score for maintainability (see point 4 - Avoid Duplication if Possible). So, what can you do to avoid it? The answer can be found at point 3 - Low Coupling.

Low Coupling

Low cohesion usually comes with high (tight) coupling. A coupling means that an operation is tightly related to other operation, in which they are hard to separate.
Taking example from the above code in point 2 - High Cohesion section, you can find that the submit operation are tightly coupled with 3 other operations: quantity validation, open connection (in this case it is coupled with SqlConnection class - in which very unrelated) and item exists validation. If you change the code to have higher cohesion, usually the coupling are reduced and you are reaching 2 objectives at the same time, Hooray!
Curious about how to decouple the logic? In OOP, this is one example of how to do it (I leave the procedural solution because I am not that familiar with abstraction for procedural programming):

Item Quantity Validation

The basic to achieve low coupling is to decouple a logic from other logic. Take example of the item quantity validation, which is trivial enough and easy to discover. You can make another class called SaleItemQuantityValidator which handle the quantity validation logic.
public class SaleItemQuantityValidator{
    public void Validate(SaleOrder sale){
        if(!sale.Items.Any(k=>k.Quantity > 0)){
            throw new InvalidOperationException("No Quantity Ordered");
        }
        // if you want to add another logic, say that no item which has 
        // negative value
    }
}
That class now has only one responsibility, and only one reason to change, that is if you want to validate the quantity of a sale order. Say you are thinking: "what if I want to add another logic such as no item with negative value?". You probably asked whether that logic will need to be decoupled into another class or not. There is, no exact answer and there are tradeoff between each other.
If you separate to other class, you has the flexibility but development overhead (now you has 2 classes). If you add the logic in the same class, you couple both logic into one, reduce flexibility in return of less overhead. I personally choose to have "70% clean", that is to keep the logic in the same class, but in separate method, in which I can decouple it easily to other class when I need to, and leaving the class as Aggregated Service or called Facade Service by Mark Seeman.

Item Exists Validation

In decoupling the item exists validation, the same issue arises with item quantity validation, whether to decouple the connection logic with item exists logic. The decouple process with connection logic can be done with unit of works and generic repository pattern. But because it is too complicated I will leave it coupled for now.
public class SaleItemExistsValidator{
    public void Validate(SaleOrder sale){
        using(SqlConnection con = new SqlConnection()){
            // sql code to get whether list of item is exists
            bool isAnyItemNotExists = bool.parse(dataRow["Result"].ToString());
            if(!isAnyItemNotExists) { throw new InvalidOperationException("Item does not exists."); }
        }
    }
}
A very straightforward validation.

Insert Logic

We use the same style with the Item Exists validation, in exchange that we do the data insert instead of doing validation.
public class SaleOrderInserter{
    public void Insert(SaleOrder sale){
        using(SqlConnection con = new SqlConnection()){
            // insert the sale order here
        }
    }
}

Tie them up

We have the logic separated to 3 other classes. Now we have to modify the SaleOrderSubmitter to use those 3 new classes.
public class SaleOrderSubmitter{
    public SaleItemQuantityValidator saleItemQuantityValidator = 
        new SaleItemQuantityValidator();
    public SaleItemExistsValidator saleItemExistsValidator =
        new SaleItemExistsValidator();
    public SaleOrderInserter saleOrderInserter =
        new SaleOrderInserter();

    public void Submit(SaleOrder sale){
        saleItemQuantityValidator.Validate(sale);
        saleItemExistsValidator.Validate(sale);
        saleOrderInserter.Insert(sale);
    }
}
The SaleOrderSubmitter now acted as a Facade Service which hold the logic required for the sale order to be submitted. The submit operation now has lower coupling with other operations. Moreover, other components can be easily replaced with their inheritance and each classes can be reused in other places.
To achieve lower coupling, there are many ways beside this design. There are constructor injection, decorators, CQRS and many other ways which I don't know yet.

Avoid Duplication as Possible

A term which is more suitable than this is "DRY" or "Don't Repeat Yourself". However many developers misinterpret the DRY principle and overusing it, making the code more complex than it should be. In reality, the duplication in DRY refer to the logic and not the operation. Take example, this piece of code:
public void Validate(Customer customer){
    ThrowErrorIfNull(customer.FirstName, "Customer First Name");
    ThrowErrorIfNull(customer.Address, "Address");
    ThrowErrorIfNull(customer.LastName, "Customer Last Name");
}
private void ThrowErrorIfNull(string value, string propertyName){
    if(string.IsNullOrEmpty(value)){
        throw new ArgumentNullException(propertyName + " is null");
    }
}
For myself, I consider that the method is already DRY. But it can be overused by using attribute and reflections that loop through the class's MethodInfo or FieldInfo in a foreach statement, eliminating the multiple call of ThrowErrorIfNull. Now your class can be reused because it is using reflection, neat huh?
The problem with that design, is that the design is too clever. It reduced readability of the code, giving you black box operations (via reflection) and learning curves for those who does not understand reflection. This design also violates the KISS (Keep It Simple Stupid) principle, in which you must not add complexity or feature which does not required yet.

Testable

Any code should be testable using a specified parameters / data given, and that tests itself should be simple enough (not complex). This is required, because when you find a bug in business logic and want to simulate it, without a good design you will have a hard time to replicate the issue.
For example if you need a specified data, such as user data which is a platinum member. In order to replicate the issue, you must first create / modify a user with platinum rank, and only after that you can begin to replicate the issue.
Taken the example of SaleOrderSubmitter code above. It is testable, assuming that the validator and insert method are virtual. You can create mock classes for the item validator class, to either throw the error or not (based on test), and doing test using that mock class. That is to ensure the SaleOrderSubmitter really doing it's job well (that it does not insert to database if item not exists).

Conclusion

A clean code has many benefits if followed. It gives you a more readable code (faster at understanding the code's behavior), flexibility and easier to modify (due to low coupling, the scope of modification will not spread to many classes).

No comments: