No Comment
During our daily standup the other day, one of the developers was complaining about the lack of commenting in a module that was written by a 3rd party consulting firm. This got many frustrated nods from the other developers in the group, which in turn led to a discussion of comments overall in our code. There seemed to be a general consensus among the group that the commenting in our code needed to be improved overall.
The reason I found it interesting (and by interesting I mean to me only) was the the timing of it all.
I’ve been on a kick lately to read nothing but Agile books (instead of the usual books on specific languages or technologies). The one currently occupying my bedside table is a book by Robert C. Martin (aka “Uncle Bob”) called Clean Code : An Handbook of Agile Software Craftsmanship.
Literally only a day or two before this particular stand-up meeting, I had read the section on commenting in the book. In fact, Uncle Bob’s views on the subject were so strong that he dedicated an entire chapter to it. He argues that comments overall are a “necessary evil” and are only needed as a means “to compensate for our failure to express ourself in code.” He hopes someday that languages evolve to DSLs that are self-documenting.
He reasons that comments are bad because Agile code changes over time. Oftentimes, the comments don’t get maintained along with the code. So the more code gets changed, the comments become further away from the current intent of that code. In fact, he boldly states that comments lie (though not necessarily intentionally), and that inaccurate comments do more harm than no comments at all. A bad comment is worse than no comment (maybe Uncle Bob was a lawyer in a former life?).
Well, if that’s the case, why bother commenting at all?
He does give some examples of useful comments:
- Legal comments, such as copyright headers, author statements, or ownership rights
- Warning of consequences
- TODO comments, which can be caught in IDEs as warnings
On the other hand, he provides numerous examples of bad comments:
- Documenting a “hack”
- Comments that mirror the functionality and/or naming of the code piece it describes
- Poorly written comments, which can be harder to read than the code itself
- Journal comments that log each time the code is changed
- Noisy comments that provide no value (e.g. // ignore )
- Position markers (e.g. //////// Public Methods /////////////)
- Closing brace comments (e.g. // end while)
- Commented out refactored code (a personal pet peeve of mine)
I don’t know if I agree with the extremist approach of removing all comments from my code. But I do agree that comments are often abused as a replacement for good code. The rule of thumb I came away from the book is: if you feel the need to document a complex chunk of code with a comment, that is a sure-fire sign of a code smell. Step back and look at the code from an maintenance perspective. Is this code likely to change? What are the odds the comments will change with it? Is the comment articulating my intent well? Have I named my variables/methods/classes intuitively? Most importantly - can I refactor the code into a standalone method that can be named expressively?
Comments can be useful, but many times end up being the poor man’s excuse for lazy coding. Don’t be that guy (or gal). Strive to make your code readable without having to resort to comments.
Friday, October 17, 2008 8:54:34 PM (Eastern Standard Time, UTC-05:00)
Agile | Back To Basics | Programming
Favor Composition Over Inheritance
I've come across all kinds of architectures in my time. Some have been good, some bad (yet somehow the bad vastly outnumber the good). Even on some of the good architectures I've seen, a common problem I've found is the abuse of object inheritance.
It's tempting to create an inheritance hierarchy to quickly get all the functionality provided by a common base class. Through one little symbol or keyword, you can easily extend your object to get all the functionality provided by some other class. However, be aware this kind of modeling comes with a price, and many times it's a price more steep than you should be willing to pay.
Which is why a lot of people much smarter than me - Josh Bloch, Venkat Subramaniam, Andy Hunt, Dave Thomas, Robert "Uncle Bob" Martin, and Steve McConnell - have all argued that using composition instead of inheritance creates a much more Agile object model. There are many reasons why Agile developers feel this way.
Conceptually
"Is-a" vs. "Has-a"
In inheritance modeling, class inheritance should only be used to represent true "subtype" relationships. Every subclass should be a subtype of the class it is extending. Object B should only extend Object A if a logical "is-a" relationship exists. In other words, you need to ask yourself, "Is every B really an A?" If you cannot truthfully answer yes to this question, there is no reason for B to extend A other than simple convenience. And convenience is not a good reason for creating an inheritance hierarchy.
Steve McConnell explains further in his seminal book Code Complete,
"If the derived class isn't going to adhere completely to the same interface contract defined by the base class, inheritance is not the right implementation technique."
Rather, you are better served modeling a "has-a" relationship between the two classes, where A is a private instance in the implementation details of B. B may depend on the services provided by A, but is not a logical subtype of it. By "wrapping" the reference to A, you will have a more conceptually accurate and flexible implementation.
It's all about the context!
A consumer of an object should be able to intuitively decipher its purpose from its public API. When you force an inheritance hierarchy of two objects that don't have an "is-a" relationship, you have created an subclass object that contains methods, properties, fields, etc. from two different contexts. By combining these contexts, you have created a confusing API that causes the consumer of it to have to actively track down the meaning behind methods exposed from the two contexts.
For example, suppose I have an Account object that extends TransactionCoordinator. The two objects have two different contexts, and the design of their individual APIs most likely reflect that fact. As a consumer of the Account object, when I see methods such as Start(), Commit(), Rollback() that have been inherited from the base class representing a different context, I am confused. Intuitively, an Account is not a TransactionCoordinator, and I would have to know that inheritance relationship even exists to understand why I am seeing these methods. Normally, I wouldn't expect to see them there. What do these methods mean? What do they do? If I'm lucky, I can look at the API documentation (if it exists) on the Account object and it will detail each method and its functionality and implications when called from the subclass. More likely, I will have to open the Account object up, see that it extends the TransactionCoordinator, and go there to research what each method actually does. And only then can I have the ability to decide if that behavior is something I want to invoke on my subclass.
Here’s the lesson - don't make consumers of your API work any harder than they have to.
Liskov's Substitution Principle
Barbara Liskov wrote one of most ground-breaking papers on object-oriented programming in 1998. In this paper, Liskov proposed that you shouldn't inherit from a base class unless the derived class truly "is-a" more specific version of the base class. In what became known as the "Liskov Substitution Principle", Liskov argued that any derived class must be able to be completely substituted for its base class interface without any knowledge of the calling class. In other words, subclasses must only change the implementation and not the meaning or intent of the base class interface.
Say you have an abstract base Account object. Callers that reference the base class API should not have to know or care which whether a CheckingAccount or SavingsAccount derived class actually provides the implementation. To take the example further, suppose there is a CalculateInterest() method declared on the Account object. In normal implementations it calculates the interest due to the consumer. Suppose you introduce a BankLoanAccount which changes the implementation to calculate the interest for the bank. This class violates the LSP since it changes the meaning of the implementation of the CalculateInterest() method, so now the caller cannot freely call the method on the base class reference.
Single Responsibility Principle
I'm a firm believer in the Single Responsibility Principle. Classes should have one responsibility, and one responsibility only. Think of it as the old Henry Ford assembly line - classes should do their one thing and do it well. There is no reason for them to know the inner details or the workings in the classes in which it interacts. This leads to smaller, more focused classes. And small, purposeful classes are A GOOD THING.
Single responsibility creates classes that are highly cohesive, while limiting coupling to external dependencies. High cohesion and low coupling is an admirable goal in any software architecture.
Planning for change
By limiting classes to one responsibility, you are also creating a class that has only one reason to change. As a developer who believes in Agile methodologies, I know change is going to come. Limiting the reasons for classes to change, as well as the surface area in which that change will be felt, will go a long way towards making those changes easier to implement.
Technical
Introduces Dependencies
Anytime you inherit from a class, you are explicitly creating a dependency on that super class. The code to the super class may change over time. Each time it does change, it can potentially directly or indirectly break some or all of its subclasses, even though the code in subclasses may not have changed at all. Most likely, the subclasses had no idea the change even occurred at all. And unless you have planned for this contingency in your unit test (you are writing unit tests, right???), you may not even find this breaking change until after the code is deployed. And if you are writing unit tests, that's just more (seemingly avoidable) unit tests you need to write to handle these cases.
This creates a lot more work for the API producer. All subclasses now will have to be designed (and tested) to evolve in conjunction with their super classes. That's more code and more unit tests you now have to write and maintain.
Fragile classes
Due to rules we must live with when choosing a static, compiled language, we limit ourselves in the structuring of our API. Anytime we define a method signature in any class, this signature causes a ripple effect up and down the inheritance hierarchy. That's both good and bad. Each base class automatically inherits the new functionality, but guess what - it may not want to do so. Or worse yet, it can break existing functionality in the subclass.
For example, introducing or changing a method signature in the base class will flow down to every class that derives from it. What if that subclass already has a method defined in that class with the same name? This will cause either: a compilation error if the method cannot be overridden; or worse yet unexpected behavior at runtime if the method can be overridden as the subclass implementation is now unknowingly invoked. Or, what about if the subclass has the same method name but a different signature? You will be forced to change your API in one of the classes just to get them to compile. The same problems manifest itself as methods are introduced in base classes, but at least you are a probably little more aware of the potentially breaking changes.
Think this is a far-fetched example? I don't. Imagine common method names like Execute(), Activate(), Register(), or IsAvailable(), Open(), Load(), or Save(). All of these are all methods names that are prolific in our object models, and could easily have much different signatures as well as potentially meaning very different things in different contexts.
What if the behavior introduced in a base class implementation is not desired in the subclass? Sorry, your S.O.L. Good and bad, you get everything that comes down from the base.
Refactoring
Due to all the problems with carrying "API baggage", classes with deep hierarchies are much harder to refactor. Are you completely comfortable with the API provided by your super class? Because anytime you extend it, any flaws/issues/concerns you have with it are propagated down to the subclasses. Not only that, the flaws now become further spread across your object model through the expanded surface area of all of its own and now its subclasses consumers. Composition is much more flexible. It has the advantage of masking, improving, or changing the API to encapsulate the existing concerns.
Not only that, but now you have tied yourself to this mis-matched inheritance hierarchy, and it will be much harder to introduce a new base class if a new valid one ever presents itself.
Summary
Inheritances is one of the most powerful and most abused concepts in object-oriented programming languages today. True, inheritance is an easy and useful way to provide quick code reusability across your object model. But the drawbacks of doing so far outweigh the short-term gains achieved. Oftentimes the reusability gained comes as the expense of code extensibility, maintainability, and testability. In the long run, ironically both producers and consumers of the API will have to do more work to maintain the fragile inheritance hierarchy. Favor composition over inheritance to limit fragile inheritance hierarchies and create a much more flexible and powerful API.
Wednesday, September 24, 2008 10:30:23 PM (Eastern Standard Time, UTC-05:00)
.Net | Back To Basics