this.Blog.Find(entry => entry.IsHelpful);
 Friday, October 17, 2008
No Comment

During our daily standup the other day, one of the developers was complaining about the lack of commenting in a module that was written by a 3rd party consulting firm.  This got many frustrated nods from the other developers in the group, which in turn led to a discussion of comments overall in our code.  There seemed to be a general consensus among the group that the commenting in our code needed to be improved overall.

The reason I found it interesting (and by interesting I mean to me only) was the the timing of it all. 

I’ve been on a kick lately to read nothing but Agile books (instead of the usual books on specific languages or technologies).  The one currently occupying my bedside table is a book by Robert C. Martin (aka “Uncle Bob”) called Clean Code : An Handbook of Agile Software Craftsmanship

Literally only a day or two before this particular stand-up meeting, I had read the section on commenting in the book.  In fact, Uncle Bob’s views on the subject were so strong that he dedicated an entire chapter to it.  He argues that comments overall are a “necessary evil” and are only needed as a means “to compensate for our failure to express ourself in code.”  He hopes someday that languages evolve to DSLs that are self-documenting.

He reasons that comments are bad because Agile code changes over time.  Oftentimes, the comments don’t get maintained along with the code.  So the more code gets changed, the comments become further away from the current intent of that code.  In fact, he boldly states that comments lie (though not necessarily intentionally), and that inaccurate comments do more harm than no comments at all.  A bad comment is worse than no comment (maybe Uncle Bob was a lawyer in a former life?). 

Well, if that’s the case, why bother commenting at all?

He does give some examples of useful comments:

  • Legal comments, such as copyright headers, author statements, or ownership rights
  • Warning of consequences
  • TODO comments, which can be caught in IDEs as warnings

On the other hand, he provides numerous examples of bad comments:

  • Documenting a “hack”
  • Comments that mirror the functionality and/or naming of the code piece it describes
  • Poorly written comments, which can be harder to read than the code itself
  • Journal comments that log each time the code is changed
  • Noisy comments that provide no value (e.g. // ignore )
  • Position markers (e.g. //////// Public Methods /////////////)
  • Closing brace comments (e.g. // end while)
  • Commented out refactored code (a personal pet peeve of mine)

I don’t know if I agree with the extremist approach of removing all comments from my code.  But I do agree that comments are often abused as a replacement for good code.  The rule of thumb I came away from the book is: if you feel the need to document a complex chunk of code with a comment, that is a sure-fire sign of a code smell.  Step back and look at the code from an maintenance perspective.  Is this code likely to change?  What are the odds the comments will change with it?  Is the comment articulating my intent well?  Have I named my variables/methods/classes intuitively?  Most importantly - can I refactor the code into a standalone method that can be named expressively?

Comments can be useful, but many times end up being the poor man’s excuse for lazy coding.  Don’t be that guy (or gal).  Strive to make your code readable without having to resort to comments.


Kick it on DotNetKicks.com
Friday, October 17, 2008 8:54:34 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  Agile | Back To Basics | Programming

 Wednesday, September 24, 2008
Favor Composition Over Inheritance

I've come across all kinds of architectures in my time.  Some have been good, some bad (yet somehow the bad vastly outnumber the good).  Even on some of the good architectures I've seen, a common problem I've found is the abuse of object inheritance. 

It's tempting to create an inheritance hierarchy to quickly get all the functionality provided by a common base class.  Through one little symbol or keyword, you can easily extend your object to get all the functionality provided by some other class.  However, be aware this kind of modeling comes with a price, and many times it's a price more steep than you should be willing to pay. 

Which is why a lot of people much smarter than me - Josh Bloch, Venkat Subramaniam, Andy Hunt, Dave Thomas, Robert "Uncle Bob" Martin, and Steve McConnell - have all argued that using composition instead of inheritance creates a much more Agile object model.  There are many reasons why Agile developers feel this way.

Conceptually

"Is-a" vs. "Has-a"

In inheritance modeling, class inheritance should only be used to represent true "subtype" relationships.  Every subclass should be a subtype of the class it is extending.  Object B should only extend Object A if a logical "is-a" relationship exists.  In other words, you need to ask yourself, "Is every B really an A?"  If you cannot truthfully answer yes to this question, there is no reason for B to extend A other than simple convenience.  And convenience is not a good reason for creating an inheritance hierarchy. 

Steve McConnell explains further in his seminal book Code Complete,

"If the derived class isn't going to adhere completely to the same interface contract defined by the base class, inheritance is not the right implementation technique."

Rather, you are better served modeling a "has-a" relationship between the two classes, where A is a private instance in the implementation details of B.  B may depend on the services provided by A, but is not a logical subtype of it.  By "wrapping" the reference to A, you will have a more conceptually accurate and flexible implementation.

It's all about the context!

A consumer of an object should be able to intuitively decipher its purpose from its public API.  When you force an inheritance hierarchy of two objects that don't have an "is-a" relationship, you have created an subclass object that contains methods, properties, fields, etc. from two different contexts.  By combining these contexts, you have created a confusing API that causes the consumer of it to have to actively track down the meaning behind methods exposed from the two contexts.

For example, suppose I have an Account object that extends TransactionCoordinator.  The two objects have two different contexts, and the design of their individual APIs most likely reflect that fact.  As a consumer of the Account object, when I see methods such as Start(), Commit(), Rollback() that have been inherited from the base class representing a different context, I am confused.  Intuitively, an Account is not a TransactionCoordinator, and I would have to know that inheritance relationship even exists to understand why I am seeing these methods.  Normally, I wouldn't expect to see them there.  What do these methods mean?  What do they do?  If I'm lucky, I can look at the API documentation (if it exists) on the Account object and it will detail each method and its functionality and implications when called from the subclass.  More likely, I will have to open the Account object up, see that it extends the TransactionCoordinator, and go there to research what each method actually does.  And only then can I have the ability to decide if that behavior is something I want to invoke on my subclass.

Here’s the lesson - don't make consumers of your API work any harder than they have to.

Liskov's Substitution Principle

Barbara Liskov wrote one of most ground-breaking papers on object-oriented programming in 1998.  In this paper, Liskov proposed that you shouldn't inherit from a base class unless the derived class truly "is-a" more specific version of the base class.  In what became known as the "Liskov Substitution Principle", Liskov argued that any derived class must be able to be completely substituted for its base class interface without any knowledge of the calling class.  In other words, subclasses must only change the implementation and not the meaning or intent of the base class interface. 

Say you have an abstract base Account object.  Callers that reference the base class API should not have to know or care which whether a CheckingAccount or SavingsAccount derived class actually provides the implementation.  To take the example further, suppose there is a CalculateInterest() method declared on the Account object.  In normal implementations it calculates the interest due to the consumer.  Suppose you introduce a BankLoanAccount which changes the implementation to calculate the interest for the bank.  This class violates the LSP since it changes the meaning of the implementation of the CalculateInterest() method, so now the caller cannot freely call the method on the base class reference.

Single Responsibility Principle

I'm a firm believer in the Single Responsibility Principle.  Classes should have one responsibility, and one responsibility only.  Think of it as the old Henry Ford assembly line - classes should do their one thing and do it well.  There is no reason for them to know the inner details or the workings in the classes in which it interacts.  This leads to smaller, more focused classes.  And small, purposeful classes are A GOOD THING. 

Single responsibility creates classes that are highly cohesive, while limiting coupling to external dependencies.  High cohesion and low coupling is an admirable goal in any software architecture.

Planning for change

By limiting classes to one responsibility, you are also creating a class that has only one reason to change.  As a developer who believes in Agile methodologies, I know change is going to come.  Limiting the reasons for classes to change, as well as the surface area in which that change will be felt, will go a long way towards making those changes easier to implement.

Technical

Introduces Dependencies

Anytime you inherit from a class, you are explicitly creating a dependency on that super class.  The code to the super class may change over time.  Each time it does change, it can potentially directly or indirectly break some or all of its subclasses, even though the code in subclasses may not have changed at all.  Most likely, the subclasses had no idea the change even occurred at all.  And unless you have planned for this contingency in your unit test (you are writing unit tests, right???), you may not even find this breaking change until after the code is deployed.  And if you are writing unit tests, that's just more (seemingly avoidable) unit tests you need to write to handle these cases.

This creates a lot more work for the API producer.  All subclasses now will have to be designed (and tested) to evolve in conjunction with their super classes.  That's more code and more unit tests you now have to write and maintain.

Fragile classes

Due to rules we must live with when choosing a static, compiled language, we limit ourselves in the structuring of our API.  Anytime we define a method signature in any class, this signature causes a ripple effect up and down the inheritance hierarchy.  That's both good and bad.  Each base class automatically inherits the new functionality, but guess what - it may not want to do so.  Or worse yet, it can break existing functionality in the subclass.

For example, introducing or changing a method signature in the base class will flow down to every class that derives from it.  What if that subclass already has a method defined in that class with the same name?  This will cause either: a compilation error if the method cannot be overridden; or worse yet unexpected behavior at runtime if the method can be overridden as the subclass implementation is now unknowingly invoked.  Or, what about if the subclass has the same method name but a different signature?  You will be forced to change your API in one of the classes just to get them to compile.  The same problems manifest itself as methods are introduced in base classes, but at least you are a probably little more aware of the potentially breaking changes.

Think this is a far-fetched example?  I don't.  Imagine common method names like Execute(), Activate(), Register(), or IsAvailable(), Open(), Load(), or Save().  All of these are all methods names that are prolific in our object models, and could easily have much different signatures as well as potentially meaning very different things in different contexts.

What if the behavior introduced in a base class implementation is not desired in the subclass?  Sorry, your S.O.L.  Good and bad, you get everything that comes down from the base.

Refactoring

Due to all the problems with carrying "API baggage", classes with deep hierarchies are much harder to refactor.  Are you completely comfortable with the API provided by your super class?  Because anytime you extend it, any flaws/issues/concerns you have with it are propagated down to the subclasses.  Not only that, the flaws now become further spread across your object model through the expanded surface area of all of its own and now its subclasses consumers.  Composition is much more flexible.  It has the advantage of masking, improving, or changing the API to encapsulate the existing concerns.

Not only that, but now you have tied yourself to this mis-matched inheritance hierarchy, and it will be much harder to introduce a new base class if a new valid one ever presents itself.

Summary

Inheritances is one of the most powerful and most abused concepts in object-oriented programming languages today.  True, inheritance is an easy and useful way to provide quick code reusability across your object model.  But the drawbacks of doing so far outweigh the short-term gains achieved.  Oftentimes the reusability gained comes as the expense of code extensibility, maintainability, and testability.  In the long run, ironically both producers and consumers of the API will have to do more work to maintain the fragile inheritance hierarchy.  Favor composition over inheritance to limit fragile inheritance hierarchies and create a much more flexible and powerful API.


Kick it on DotNetKicks.com
Wednesday, September 24, 2008 10:30:23 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  .Net | Back To Basics

 Saturday, July 19, 2008
Don't call virtual methods in a constructor

I ran into a piece of code today similar to this:

public class Base
{
    public Base()
    {
        this.Initialize();
    }

    protected virtual void Initialize()
    {
        Debug.WriteLine("Base.Initialize()");
    }
}

public class Derived : Base
{
    public Derived() : base() {}

    protected override void Initialize()
    {
        base.Initialize();
    }
}

public class FurtherDerived : Derived
{
    public FurtherDerived() : base() {}

    protected override void Initialize()
    {
        Debug.WriteLine("FurtherDerived.Initialize()");
    }
}

public class Tester
{
    public static void Main(string[] args)
    {
        Base myBase = new Base();

        Derived myDerived = new Derived();

        FurtherDerived myFurtherDerived = new FurtherDerived();

        Console.ReadLine();
    }
}

See any problems here?  Even though the compiler lets you call virtual methods from a base constructor, it's generally a "bad idea".  That's because the constructor in the derived classes defers its construction up to its base classes before it executes its own constructor (even if we don't explicitly call the base constructor like in the example).  Yet the call to a virtual function in the base class constructor is executed in the derived class implementation.  This can cause subtle, unexpected problems.

In the example above, things "happen" to go well.  It actually runs just fine.  It outputs the following as you'd probably expect:

Base.Initialize()
Base.Initialize()
FurtherDerived.Initialize()

But that's only because we didn't do anything significant in the derived constructors.  What happens if we change the FurtherDerived class:

public class FurtherDerived : Derived
{
    StringBuilder _sb;

    public FurtherDerived() : base()
    {
        _sb = new StringBuilder();
    }

    protected override void Initialize()
    {
       _sb.Append("FurtherDerived.Initialize()");
       Debug.WriteLine(_sb.ToString());
    }
}

Guess what happens when we run this code? FAIL!!! The dreaded, "Object reference not set to a reference of an object". Since the Initialize() method in the FurtherDerived class is called from the Base constructor, we are trying to access the _sb class level parameter before it's ever initialized.  We are calling a method on an class we know has not been fully constructed.  That ain't good.

The lesson is - never call a virtual method from a constructor.  At best, you are introducing a bug waiting to happen.  At worst, you've already done so.


Kick it on DotNetKicks.com
Friday, July 18, 2008 11:57:17 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  .Net | Back To Basics | C#

 Sunday, June 22, 2008
Enforce non-instantiability on static classes

Every project ends up with one or even possibly many static helper classes.  If you program in Java, I'm willing to bet the house your project has a StringUtil class lurking around somewhere.  These classes are easily abused, but can serve a valid purpose, such as a factory class responsible for creating instances. 

If these classes contain only static methods (as they usually do), there is never any reason for them to to be instantiated.  However, in both Java and C#, it's easy to forget that if you do not explicitly declare a constructor, the compiler will generate a no-arg constructor by default.  The consequence is that the following code is valid:

StringUtil util = new StringUtil();

You've just allocated memory unnecessarily.  Plus, the garbage collector will now incur extra overhead since it will have to reclaim this memory even though it was never actually used.  All of the methods are static, so there is no reason to instantiate an individual object from this class. 

To avoid instantiations, make sure you declare a private no-arg constructor.  This will cause the compiler to throw an error any time the class is attempted to be instantiated.

public class StringUtil
{
    // constructor marked private since all methods are static
    private StringUtil() {}

    // bunch of static methods not shown
}

A side note - If you are using C# 3.0, I recommend using extension methods to enhance the behavior of existing classes rather than creating a something like a dedicated StringUtil class.  First of all, any class declaring extension methods has to be marked as static and cannot be instantiated.  And more importantly, it provides a much cleaner implementation to add those helper methods directly on the affected class.  The added bonus is that Visual Studio is smart enough to provide Intellisense for extension methods on targeted classes.


Kick it on DotNetKicks.com
Sunday, June 22, 2008 9:35:25 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  Back To Basics | C# | Java | Programming