this.Blog.Find(entry => entry.IsHelpful);
 Wednesday, September 24, 2008
Favor Composition Over Inheritance

I've come across all kinds of architectures in my time.  Some have been good, some bad (yet somehow the bad vastly outnumber the good).  Even on some of the good architectures I've seen, a common problem I've found is the abuse of object inheritance. 

It's tempting to create an inheritance hierarchy to quickly get all the functionality provided by a common base class.  Through one little symbol or keyword, you can easily extend your object to get all the functionality provided by some other class.  However, be aware this kind of modeling comes with a price, and many times it's a price more steep than you should be willing to pay. 

Which is why a lot of people much smarter than me - Josh Bloch, Venkat Subramaniam, Andy Hunt, Dave Thomas, Robert "Uncle Bob" Martin, and Steve McConnell - have all argued that using composition instead of inheritance creates a much more Agile object model.  There are many reasons why Agile developers feel this way.

Conceptually

"Is-a" vs. "Has-a"

In inheritance modeling, class inheritance should only be used to represent true "subtype" relationships.  Every subclass should be a subtype of the class it is extending.  Object B should only extend Object A if a logical "is-a" relationship exists.  In other words, you need to ask yourself, "Is every B really an A?"  If you cannot truthfully answer yes to this question, there is no reason for B to extend A other than simple convenience.  And convenience is not a good reason for creating an inheritance hierarchy. 

Steve McConnell explains further in his seminal book Code Complete,

"If the derived class isn't going to adhere completely to the same interface contract defined by the base class, inheritance is not the right implementation technique."

Rather, you are better served modeling a "has-a" relationship between the two classes, where A is a private instance in the implementation details of B.  B may depend on the services provided by A, but is not a logical subtype of it.  By "wrapping" the reference to A, you will have a more conceptually accurate and flexible implementation.

It's all about the context!

A consumer of an object should be able to intuitively decipher its purpose from its public API.  When you force an inheritance hierarchy of two objects that don't have an "is-a" relationship, you have created an subclass object that contains methods, properties, fields, etc. from two different contexts.  By combining these contexts, you have created a confusing API that causes the consumer of it to have to actively track down the meaning behind methods exposed from the two contexts.

For example, suppose I have an Account object that extends TransactionCoordinator.  The two objects have two different contexts, and the design of their individual APIs most likely reflect that fact.  As a consumer of the Account object, when I see methods such as Start(), Commit(), Rollback() that have been inherited from the base class representing a different context, I am confused.  Intuitively, an Account is not a TransactionCoordinator, and I would have to know that inheritance relationship even exists to understand why I am seeing these methods.  Normally, I wouldn't expect to see them there.  What do these methods mean?  What do they do?  If I'm lucky, I can look at the API documentation (if it exists) on the Account object and it will detail each method and its functionality and implications when called from the subclass.  More likely, I will have to open the Account object up, see that it extends the TransactionCoordinator, and go there to research what each method actually does.  And only then can I have the ability to decide if that behavior is something I want to invoke on my subclass.

Here’s the lesson - don't make consumers of your API work any harder than they have to.

Liskov's Substitution Principle

Barbara Liskov wrote one of most ground-breaking papers on object-oriented programming in 1998.  In this paper, Liskov proposed that you shouldn't inherit from a base class unless the derived class truly "is-a" more specific version of the base class.  In what became known as the "Liskov Substitution Principle", Liskov argued that any derived class must be able to be completely substituted for its base class interface without any knowledge of the calling class.  In other words, subclasses must only change the implementation and not the meaning or intent of the base class interface. 

Say you have an abstract base Account object.  Callers that reference the base class API should not have to know or care which whether a CheckingAccount or SavingsAccount derived class actually provides the implementation.  To take the example further, suppose there is a CalculateInterest() method declared on the Account object.  In normal implementations it calculates the interest due to the consumer.  Suppose you introduce a BankLoanAccount which changes the implementation to calculate the interest for the bank.  This class violates the LSP since it changes the meaning of the implementation of the CalculateInterest() method, so now the caller cannot freely call the method on the base class reference.

Single Responsibility Principle

I'm a firm believer in the Single Responsibility Principle.  Classes should have one responsibility, and one responsibility only.  Think of it as the old Henry Ford assembly line - classes should do their one thing and do it well.  There is no reason for them to know the inner details or the workings in the classes in which it interacts.  This leads to smaller, more focused classes.  And small, purposeful classes are A GOOD THING. 

Single responsibility creates classes that are highly cohesive, while limiting coupling to external dependencies.  High cohesion and low coupling is an admirable goal in any software architecture.

Planning for change

By limiting classes to one responsibility, you are also creating a class that has only one reason to change.  As a developer who believes in Agile methodologies, I know change is going to come.  Limiting the reasons for classes to change, as well as the surface area in which that change will be felt, will go a long way towards making those changes easier to implement.

Technical

Introduces Dependencies

Anytime you inherit from a class, you are explicitly creating a dependency on that super class.  The code to the super class may change over time.  Each time it does change, it can potentially directly or indirectly break some or all of its subclasses, even though the code in subclasses may not have changed at all.  Most likely, the subclasses had no idea the change even occurred at all.  And unless you have planned for this contingency in your unit test (you are writing unit tests, right???), you may not even find this breaking change until after the code is deployed.  And if you are writing unit tests, that's just more (seemingly avoidable) unit tests you need to write to handle these cases.

This creates a lot more work for the API producer.  All subclasses now will have to be designed (and tested) to evolve in conjunction with their super classes.  That's more code and more unit tests you now have to write and maintain.

Fragile classes

Due to rules we must live with when choosing a static, compiled language, we limit ourselves in the structuring of our API.  Anytime we define a method signature in any class, this signature causes a ripple effect up and down the inheritance hierarchy.  That's both good and bad.  Each base class automatically inherits the new functionality, but guess what - it may not want to do so.  Or worse yet, it can break existing functionality in the subclass.

For example, introducing or changing a method signature in the base class will flow down to every class that derives from it.  What if that subclass already has a method defined in that class with the same name?  This will cause either: a compilation error if the method cannot be overridden; or worse yet unexpected behavior at runtime if the method can be overridden as the subclass implementation is now unknowingly invoked.  Or, what about if the subclass has the same method name but a different signature?  You will be forced to change your API in one of the classes just to get them to compile.  The same problems manifest itself as methods are introduced in base classes, but at least you are a probably little more aware of the potentially breaking changes.

Think this is a far-fetched example?  I don't.  Imagine common method names like Execute(), Activate(), Register(), or IsAvailable(), Open(), Load(), or Save().  All of these are all methods names that are prolific in our object models, and could easily have much different signatures as well as potentially meaning very different things in different contexts.

What if the behavior introduced in a base class implementation is not desired in the subclass?  Sorry, your S.O.L.  Good and bad, you get everything that comes down from the base.

Refactoring

Due to all the problems with carrying "API baggage", classes with deep hierarchies are much harder to refactor.  Are you completely comfortable with the API provided by your super class?  Because anytime you extend it, any flaws/issues/concerns you have with it are propagated down to the subclasses.  Not only that, the flaws now become further spread across your object model through the expanded surface area of all of its own and now its subclasses consumers.  Composition is much more flexible.  It has the advantage of masking, improving, or changing the API to encapsulate the existing concerns.

Not only that, but now you have tied yourself to this mis-matched inheritance hierarchy, and it will be much harder to introduce a new base class if a new valid one ever presents itself.

Summary

Inheritances is one of the most powerful and most abused concepts in object-oriented programming languages today.  True, inheritance is an easy and useful way to provide quick code reusability across your object model.  But the drawbacks of doing so far outweigh the short-term gains achieved.  Oftentimes the reusability gained comes as the expense of code extensibility, maintainability, and testability.  In the long run, ironically both producers and consumers of the API will have to do more work to maintain the fragile inheritance hierarchy.  Favor composition over inheritance to limit fragile inheritance hierarchies and create a much more flexible and powerful API.


Kick it on DotNetKicks.com
Wednesday, September 24, 2008 10:30:23 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  .Net | Back To Basics