Code Smell: Array-based Data Model

In OOP, you have classes to define your data model. An object can be passed between methods, and even between other classes, which makes it very flexible to use while still preserving the data structure. Nice, isn't it? Unfortunately, there is a group of programmers who prefer an array-based data model over modeling with classes.

If you find that group working on medium-to-high complexity projects, quickly demote them and exclude them from social events. If you find yourself doing the same thing, quickly go to the nearest place of worship, atone for your sins, and start a new life. Is it that bad? Yes, it is. Why? Here we go:


Array-Based Data Model

Using an array, which in C# can be exposed as IEnumerable, is not wrong. But it should do one and only one job: keeping some objects together. A generic collection is the best way to prevent mistakes, while a non-generic one can be harmful. In a weakly typed language like PHP, the problem gets even worse.

Examples are IEnumerable<IEmployee> or IEnumerable<IProduct>. These are fine because we know they hold a specific class as the data model, and they do not act as the data model themselves. Meanwhile, Dictionary<string, object> can be used by a "lazy", evil programmer who wants to overthrow the reign of the class. IEnumerable<Dictionary<string, object>> acts as that programmer's army to do it.
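To make the difference concrete, here is a minimal sketch of both styles side by side. The names IEmployee, Employee, Payroll, and the "Salary" key are invented for this illustration; only the collection types come from the text above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical interface and class, invented for this illustration.
public interface IEmployee
{
    string Name { get; }
    decimal Salary { get; }
}

public sealed class Employee : IEmployee
{
    public Employee(string name, decimal salary)
    {
        Name = name;
        Salary = salary;
    }

    public string Name { get; }
    public decimal Salary { get; }
}

public static class Payroll
{
    // The collection only keeps the objects together;
    // IEmployee is what describes the data.
    public static decimal TotalSalaryTyped(IEnumerable<IEmployee> employees)
    {
        return employees.Sum(e => e.Salary);
    }

    // The dictionary acts as the data model itself. The key "Salary" and the
    // cast to decimal are assumptions the compiler cannot check, so a typo
    // or a value stored as another type only fails at runtime.
    public static decimal TotalSalaryUntyped(IEnumerable<Dictionary<string, object>> rows)
    {
        return rows.Sum(r => (decimal)r["Salary"]);
    }
}
```

If a caller stores the salary as an int instead of a decimal, the typed version refuses to compile, while the untyped version compiles happily and throws an InvalidCastException when the cast runs.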

You Need to Look Back for Object Creation

The very first threat of the Dictionary<string, object> army is its ability to conceal its members. We do not know the member composition until runtime. Code can add or remove members, and even change the object model, as it wishes.

Take for example the following code:

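// Requires: using System; using System.Collections.Generic;
Dictionary<string, object> model = new Dictionary<string, object>();
model["first"] = 1;                // an int
model["first"] = 1.1f;             // now a float
model["first"] = "hello world";    // now a string
Console.WriteLine(model["first"]); // prints "hello world"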

You can see that its member named "first" can be an integer, a float, or a string depending on the assignment, and can change at runtime. Even assuming you use it as immutable data, you still need to trace back to the object creation step to find out its members. With a class, you only need to look at the class structure to see the members. Do you want a nightmare? Imagine the members of that Dictionary being changed somewhere in a 15-step method call stack. Add more call stack to make it scarier.
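For comparison, here is a rough sketch of the same data as a class. Greeting is a made-up name for this illustration; the point is only that the member name and type can be read from the class itself, without hunting for the place where the object was created.

```csharp
using System;

// Hypothetical model class, not part of the original example.
public sealed class Greeting
{
    public Greeting(string first)
    {
        First = first;
    }

    // The name and type of the member are fixed at compile time,
    // and the property cannot be reassigned after construction.
    public string First { get; }
}

public static class Program
{
    public static void Main()
    {
        var model = new Greeting("hello world");
        // model.First = 1; // would not compile: read-only, and an int is not a string
        Console.WriteLine(model.First);
    }
}
```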

The Common C# Mistake

As a programmer, you may be aware that Dictionary<string, object> is harmful and avoid it altogether. However, the most common mistake in the C# OOP world is using DataSet, DataTable and DataRow as the data model. Usually the "lazy" programmer just runs the query and processes the threatening data model inside the same method. This is often overlooked and mostly treated as acceptable behavior.

So what is the solution? Convert it immediately into class objects and return them. Ensure that every other class only accepts objects of a defined class. The only class allowed to accept a DataRow is the DataRow-to-class converter.
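A converter like that could look roughly like the sketch below. The Product class, the ProductConverter name, and the "Id" and "Name" columns are assumptions made up for this example; adjust them to whatever your query really returns.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical model class; the members mirror the assumed columns.
public sealed class Product
{
    public Product(int id, string name)
    {
        Id = id;
        Name = name;
    }

    public int Id { get; }
    public string Name { get; }
}

// The only class that is allowed to know about DataRow and DataTable.
public static class ProductConverter
{
    public static Product FromRow(DataRow row)
    {
        // Column names "Id" and "Name" are assumptions for this sketch.
        return new Product(
            Convert.ToInt32(row["Id"]),
            Convert.ToString(row["Name"]));
    }

    public static List<Product> FromTable(DataTable table)
    {
        var products = new List<Product>(table.Rows.Count);
        foreach (DataRow row in table.Rows)
        {
            products.Add(FromRow(row));
        }
        return products;
    }
}
```

The method that runs the query calls ProductConverter.FromTable once and returns the List<Product>, so nothing downstream ever touches a DataRow.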

Conclusion

An array-based data model with casting (or worse, one in a weakly typed language) is a code smell. You cannot know the member composition before runtime. Avoid it at all costs, with the exception of the converter class (from array-based to class-based data model), which is frequently needed for DataRow. I haven't met any other requirement for an array-based data model, except when the code is dynamically generated.
