Category Archives: SOLID

Brewer’s CAP Theorem

Brewer’s CAP theorem is an important concept in scalability discussions. The theorem states that of the three properties Consistency, Availability, and Partition tolerance, a distributed system can guarantee only two at once. Below are three illustrations of how this works. For the purposes of these examples, we will imagine a cluster of three storage nodes used to store user profiles.

Scenario A: Sacrificing Partition Tolerance

On each of the three nodes, we will only store a subset of the user profiles. This is called sharding. Node one will have users A-H, node two I-S, and node three T-Z. As long as each node is up and running, we have achieved three times the throughput of a single node, as each node serves only a third of the traffic (assuming, of course, that user profile querying and updating is uniformly distributed through the alphabet). Consistency is achieved because immediately after data is written, it is accessible. Availability is achieved because each server is accessible in real time. However, we have lost partition tolerance, as the loss of one server renders a whole section of users unreachable. Worse, a hardware failure could mean permanent data loss. All in all, not a good sacrifice under most circumstances.
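A sketch of the routing logic for this scheme (the class, method, and boundary values here are invented for illustration, not part of any real system):

```csharp
using System;

public class ShardRouter
{
    // Upper bound (inclusive) of the letter range each node owns:
    // node 0 = A-H, node 1 = I-S, node 2 = T-Z.
    private static readonly char[] UpperBounds = { 'H', 'S', 'Z' };

    // Returns the index of the node responsible for this user's profile.
    public static int NodeFor(string userName)
    {
        var first = char.ToUpperInvariant(userName[0]);
        for (int i = 0; i < UpperBounds.Length; i++)
        {
            if (first <= UpperBounds[i])
                return i;
        }
        throw new ArgumentException("User name must start with a letter A-Z.");
    }
}
```

If node 0 goes down, every A-H profile is simply gone; no other node holds a copy, which is exactly the partition tolerance we gave up.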

Scenario B: Sacrificing Availability

On each of the three nodes, we will store all the user profiles. Furthermore, to guarantee data consistency and prevent data loss, we will ensure that every write into the system happens on all three nodes before it is acknowledged as complete. So, if we were to update a profile for Bob McBob, any subsequent queries or writes on Bob McBob’s profile would be blocked until the update has completed. Even worse, when one of the nodes is lost but all three writes are still required, our entire system is unavailable until the node is restored. This means that while our data is consistent and protected, we have sacrificed the availability of the data. This is a reasonable sacrifice in some systems. However, our goal is scalability, and this does not fit that requirement.
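A sketch of this write path (the class and method names are invented for illustration): a write only succeeds if every node can apply it, so a single downed node makes the whole system unavailable for writes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ConsistentStore
{
    private readonly List<Dictionary<string, string>> _nodes;
    private readonly HashSet<int> _downNodes = new HashSet<int>();

    public ConsistentStore(int nodeCount)
    {
        _nodes = Enumerable.Range(0, nodeCount)
                           .Select(_ => new Dictionary<string, string>())
                           .ToList();
    }

    public void TakeNodeDown(int index)
    {
        _downNodes.Add(index);
    }

    // A write completes only when every node has applied it.
    public void Write(string key, string value)
    {
        if (_downNodes.Count > 0)
            throw new InvalidOperationException(
                "Unavailable: every node must acknowledge the write.");

        foreach (var node in _nodes)
            node[key] = value;
    }
}
```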

Scenario C: Sacrificing Consistency

On each of the three nodes, we will store all the user profiles. However (and unlike scenario B), we will acknowledge a completed write immediately and not wait for the other two nodes. This means that if a read comes in on node two for data written on node one, it may or may not be up-to-date depending on the latency of replication. We are still highly available and still partition tolerant (within the window it takes to replicate to a second node).
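A sketch of the same cluster under eventual consistency (again, the shapes here are invented for illustration): the write acknowledges immediately, replication catches up later, and a read on another node can briefly see stale data.

```csharp
using System.Collections.Generic;
using System.Linq;

public class EventuallyConsistentStore
{
    private readonly List<Dictionary<string, string>> _nodes;
    private readonly Queue<KeyValuePair<string, string>> _pending =
        new Queue<KeyValuePair<string, string>>();

    public EventuallyConsistentStore(int nodeCount)
    {
        _nodes = Enumerable.Range(0, nodeCount)
                           .Select(_ => new Dictionary<string, string>())
                           .ToList();
    }

    // Acknowledge after writing to a single node; queue replication for later.
    public void Write(int nodeIndex, string key, string value)
    {
        _nodes[nodeIndex][key] = value;
        _pending.Enqueue(new KeyValuePair<string, string>(key, value));
    }

    public string Read(int nodeIndex, string key)
    {
        string value;
        return _nodes[nodeIndex].TryGetValue(key, out value) ? value : null;
    }

    // Simulates the background replication catching up.
    public void Replicate()
    {
        while (_pending.Count > 0)
        {
            var item = _pending.Dequeue();
            foreach (var node in _nodes)
                node[item.Key] = item.Value;
        }
    }
}
```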

A majority of the time, scenario C is the chosen path, for a couple of reasons. First, most business use cases do not require up-to-the-second information. Take, for instance, a generated report on the sales in a given region. While the business user may request “live” data, monitoring the usage of such a report will likely look as follows: 1) User prints a report and waits for x time (perhaps some coffee is obtained). 2) User imports the report into Excel and slices and dices the data for y time. 3) User acts upon the information. Overall, the decision is delayed from the “live” data by x + y, which is most likely on the order of hours and is definitely not based on “live” data.

Second, a business benefit can be garnered in some cases. Let’s take an ATM, for instance. Upon a withdrawal of funds, the ATM looks at the data available to it to decide whether or not to allow the transaction to proceed. It is not aware of any pending transfers to or from the account, and it is definitely not aware of what occurred in the last x minutes. If you were to use a mobile phone to move some money out of your account and then withdraw from the ATM, the account would look no different and the ATM would allow an overdraft. Ultimately, the bank, by choosing “eventual” consistency, has earned itself an overdraft fee.

In my next post, I’d like to discuss how this eventually consistent model applies to Event Sourcing and how we can structure our applications to take advantage of this in the context of enforcing transactional consistency.


Event Sourcing as the Canonical Source of Truth

Event Sourcing (ES) is a concept that enables us to peruse the history of our system and know its state at any point in time.  A few reasons this is important range from investigating a bug that only occurs under certain conditions to understanding why something was changed (why was customer X’s address changed?).  Another distinct advantage of event sourcing is that we can rebuild an entire data store (SQL tables, MongoDB collections, flat files, etc…) by replaying each event against a listener.  This would look something like the code below:

var listeners = GetAllListeners();

foreach (var evt in GetAllEvents())
{
    listeners.Handle(evt);
}

It’s quite simple and elegant, but more important is that the event log becomes the canonical data store; the single source of truth.  That alone yields some interesting possibilities.  One of which I’m quite fond: upon discovering a poorly conceived database schema, it is quite simple to redesign the schema and rebuild it as if it had been in place on day one, using a likeness of the above code snippet.  In the same vein, imagine two separate applications needing access to the same data but having very different business models.  Instead of each application consuming a schema that makes sense for only one (or neither, due to compromise), each has its own model serving its own needs.  Since neither is the canonical store of the data, duplication of data isn’t something to be frightened of.

One thing to note as the discussion moves into scalability is that employing the denormalized schema design enabled by ES already increases our ability to scale.  When intersection of sets (SQL joins) is unnecessary, queries against relational data sources perform much faster.
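To illustrate (the event and view classes below are invented for this example), a listener can keep a flat, join-free view up to date as events are replayed:

```csharp
using System.Collections.Generic;

// A hypothetical event raised when a customer moves.
public class CustomerMoved
{
    public int CustomerId;
    public string Street;
    public string State;
}

// A denormalized row: everything a query needs in one place, no joins required.
public class CustomerAddressView
{
    public int CustomerId;
    public string Street;
    public string State;
}

public class CustomerAddressListener
{
    public readonly Dictionary<int, CustomerAddressView> Views =
        new Dictionary<int, CustomerAddressView>();

    // Replaying all CustomerMoved events rebuilds the view from scratch.
    public void Handle(CustomerMoved e)
    {
        Views[e.CustomerId] = new CustomerAddressView
        {
            CustomerId = e.CustomerId,
            Street = e.Street,
            State = e.State
        };
    }
}
```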

At this point, I have posted three articles, an introduction on how I got to where I am now, a discussion of CQRS, and now a discussion of ES.  I’d like to come full circle and discuss how CQRS + ES can be used to achieve further scalability, but first I need to address Brewer’s CAP Theorem and how it forms the backbone of many design decisions related to scalability.

Managing Complexity with CQRS

CQRS stands for command-query responsibility segregation.  It literally means to separate your commands from your queries, your reads from your writes.

This can take on many forms, the simplest having command messages differ from query messages.  It might seem obvious when stated like this, but I guarantee you have violated this idea numerous times.  I know I have.  For instance, take the example below of a client utilizing a service for a customer.

var customer = _service.GetCustomer(10);

customer.Address.Street = "1234 Blah St.";

_service.UpdateCustomer(customer);

The example above has two interesting characteristics.  First, we are sending the entire customer object back to update a single field in the address.  This isn’t necessarily bad, but it brings me to the second observation.

When looking through the history of a customer, it is impossible to tell why the street was changed (or whether it meaningfully changed at all).  The business intent of the change is missing.  Did the customer move?  Was there a typo in the street?  These are very different intentions that mean very different things.  Take, for instance, a business which sends a letter confirming receipt of the change whenever a customer moves.  The above snippet of code only allows us to send a letter when the address has changed, for whatever reason.

Perhaps this would look a little better.

var customer = _service.GetCustomer(10);

var address = customer.Address;

address.Street = "1234 Blah St.";

_service.CustomerHasMovedTo(customer.Id, address);

//or for a typo

_service.CorrectAddress(customer.Id, address);

While the above code may be a little more verbose, the intent is clear and the business can act accordingly.  Perhaps the business could disallow the move of a customer from Arizona (AZ) to Arkansas (AR) while still allowing a typo correcting an address that was supposed to be in Arkansas but was input incorrectly as Arizona.
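A sketch of how the service side might act on that intent (the rule and the internal state are hypothetical, carried over from the example above):

```csharp
using System;

public class CustomerService
{
    // Stand-in for the customer's current state; in a real system
    // this would be loaded from the data store.
    private string _currentState = "AZ";

    public void CustomerHasMovedTo(int customerId, string street, string state)
    {
        // A move expresses business intent, so business rules can veto it.
        if (_currentState == "AZ" && state == "AR")
            throw new InvalidOperationException(
                "Moves from AZ to AR are not allowed.");

        _currentState = state;
    }

    public void CorrectAddress(int customerId, string street, string state)
    {
        // A typo correction carries no such rule; just fix the data.
        _currentState = state;
    }
}
```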

In addition to business intent being important in the present, it is also important in the past.  The ability to reflect over historical events can prove an invaluable asset to a business.  In my next post, I’d like to discuss the Event Sourcing pattern.

My Road to CQRS

I remember my first job after graduating from college was building an internal .NET application interfacing with a legacy system.  This legacy system ran on an AS400 and was written in RPG utilizing a DB2 database for data storage.  When I looked at the database schema, I was horrified.  Opposite of any form of normalization, the real world apparently didn’t build applications as my school had taught.

That wasn’t true, however. The remainder of that job, every job thereafter, and most of the articles I read on the interwebs espoused the same principles I was taught.  Normalize your data, maintain referential integrity, run write operations inside a transaction, etc…  This seemed the universally accepted way to build systems.  So I continued on this learned trajectory, churning out quality software meeting business requirements as specified.  When the database had trouble handling my 17-join query, we cached the results at the application layer.  When we could deal with day-old data, we ran ETL processes at night to pre-calculate the source of the burdensome queries.

In retrospect, I wish these situations had triggered my memory of that first legacy system.  The application cache and the extracted tables mirrored those early schemas that kept me up at night. And worse still, these objects were not just used to read the data; they were also used to update it.  While I preached the principles of SOLID, I ignorantly violated the first letter in the acronym.

So, what did those original developers know that I didn’t? The original system was built to run on a mainframe with distributed terminal clients.  The mainframe does all the work while the clients simply view screens and then issue commands or queries to change the view or update the data, respectively.  That very closely resembles the architecture of the web: a webserver on a box and a number of browsers connecting to it to view data or post forms.  These days, our web servers can handle a whole lot of load (especially when load balanced), much more than the original mainframes.  So, the mainframe guys supporting distributed clients are like a website supporting gazillions of hits a day (an hour?).  How did they manage this complexity?

CQRS stands for command-query responsibility segregation.  It literally means separating your commands from your queries; your reads from your writes.  They are responsible for different things.  Reads don’t have any business logic in them (aside from authorization perhaps).  So why did I keep insisting on a single model to rule them all?

This may sound complex and it can be.  In my next post, I want to delve into how CQRS can help us manage this complexity.

Build Your Own IoC Container User Group Recording

So, apparently the talk I gave on the Build Your Own IoC Container series was recorded and posted online. Being one of my first talks, I thought it went well. If I’d known how they were recording, I would have done a few things differently, like repeating the questions that were asked, but overall it went pretty well.

There is no sound for about 3 minutes, and then I get interrupted by the guys running the group to announce some things, but after we get through that, it is pretty smooth.

http://usergroup.tv/videos/build-you-own-ioc-container

Hope you enjoy…

Building an IoC Container – Cyclic Dependencies

The code for this step is located here.

We just finished doing a small refactoring to introduce a ResolutionContext class. This refactoring was necessary to allow us to handle cyclic dependencies. Below is a test that will fail right now with a StackOverflowException because the resolver is going in circles.

public class when_resolving_a_type_with_cyclic_dependencies : ContainerSpecBase
{
    static Exception _ex;

    Because of = () =>
        _ex = Catch.Exception(() => _container.Resolve(typeof(DummyService)));

    It should_throw_an_exception = () =>
        _ex.ShouldNotBeNull();

    private class DummyService
    {
        public DummyService(DepA a)
        { }
    }

    private class DepA
    {
        public DepA(DummyService s)
        { }
    }
}

Technically, a StackOverflowException would get thrown. However, this type of exception cannot be caught; it takes out the whole app, and the test runner won’t be able to complete. Regardless, it shouldn’t take a minute of recursion to blow the stack; this should fail almost instantaneously.

With a slight modification to our ResolutionContext class, we can track whether or not a cycle exists in the resolution chain and abort early. There are two methods that need to be modified.

public object ResolveDependency(Type type)
{
    var registration = _registrationFinder(type);
    var context = new ResolutionContext(registration, _registrationFinder);
    context.SetParent(this);
    return context.GetInstance();
}

private void SetParent(ResolutionContext parent)
{
    _parent = parent;
    while (parent != null)
    {
        if (ReferenceEquals(Registration, parent.Registration))
            throw new Exception("Cycles found");

        parent = parent._parent;
    }
}

We begin by allowing the ResolutionContext to track its parent resolution context. This, as you can see in the SetParent method, allows us to walk up the chain of parents and check whether we have already tried to resolve a given registration. Other than that, nothing special is going on and everything else still works correctly.

At this point, we are at the end of the Building an IoC Container series. I hope you have learned a little more about how the internals of your favorite containers work and, even more so, that there isn’t a lot of magic going on. This is something you can explain to your peers or mentees, hopefully allowing the use of IoC to gain acceptance in areas that were once off-limits because it was a “black box”. Be sure to leave me a comment if you have any questions or anything else you’d like to have done to our little IoC container.

Building an IoC Container – Refactoring

The code for this step is located here.

In the last post, we added support for singleton and transient lifetimes. But the last couple of posts have made our syntax look a bit unwieldy, and it is somewhat limiting when looking toward the future, primarily when we need to detect cycles in the resolution chain. So today, we are going to refactor our code by introducing a new class, ResolutionContext. This will get created every time a Resolve call is made. There isn’t a lot to say without looking at the code, so below is the ResolutionContext class.

public class ResolutionContext
{
    private readonly Func<Type, object> _resolver;

    public Registration Registration { get; private set; }

    public ResolutionContext(Registration registration, Func<Type, object> resolver)
    {
        Registration = registration;
        _resolver = resolver;
    }

    public object Activate()
    {
        return Registration.Activator.Activate(this);
    }

    public object GetInstance()
    {
        return Registration.Lifetime.GetInstance(this);
    }

    public object ResolveDependency(Type type)
    {
        return _resolver(type);
    }

    public T ResolveDependency<T>()
    {
        return (T)ResolveDependency(typeof(T));
    }
}

Nothing in this is really that special. The Activate method and the GetInstance method are simply here to hide away the details so the caller doesn’t need to dot through the Registration so much (Law of Demeter). The Func<Type, object> is still here, but this is where it stops. Shown below, our IActivator and ILifetime interfaces now take a ResolutionContext instead of the delegates.

public interface IActivator
{
    object Activate(ResolutionContext context);
}

public interface ILifetime
{
    object GetInstance(ResolutionContext context);
}

Now, they look almost exactly the same, so, as we discussed in the last post, the difference is purely semantic. Activators construct things and Lifetimes manage them.

Finally, no new tests have been added, but a number have changed due to this refactoring. I’d advise you to check out the full source and look it over yourself. In our next post, we’ll be handling cyclic dependencies now that we have an encapsulated ResolutionContext to track calls.

Building an IoC Container – Adding Lifetimes

The code for this step is located here.

We left off the previous post with the need to support different lifetime models such as Singleton, Transient, or per HTTP request. After our last refactoring, this is actually a fairly simple step if we follow the single responsibility principle and define the roles each of our abstractions plays.

Currently, we only have one abstraction, which is around activation. It would be fairly simple to use activators to handle lifetime as well, but things will quickly become complex if we go this route. Therefore, we will define an activator’s job as activating an object. In other words, its only job is to construct an object using whatever means it wants, but it should never store the instance of an object in order to satisfy a lifetime requirement.

In order to manage lifetimes, we will introduce a second abstraction called ILifetime. Its sole job is to manage the lifetime of an activated object. It can (and will) use an activator to get an instance of an object, but it will never construct one itself.

By keeping these two ideas separate, this becomes a fairly trivial addition. Below are the two tests for this functionality, one for transients and one for singletons.

public class when_resolving_a_transient_type_multiple_times : ContainerSpecBase
{
    static object _result1;
    static object _result2;

    Because of = () =>
    {
        _result1 = _container.Resolve(typeof(DummyService));
        _result2 = _container.Resolve(typeof(DummyService));
    };

    It should_not_return_the_same_instances = () =>
    {
        _result1.ShouldNotBeTheSameAs(_result2);
    };

    private class DummyService { }
}

public class when_resolving_a_singleton_type_multiple_times : ContainerSpecBase
{
    static object _result1;
    static object _result2;

    Establish context = () =>
        _container.Register<DummyService>().Singleton();

    Because of = () =>
    {
        _result1 = _container.Resolve(typeof(DummyService));
        _result2 = _container.Resolve(typeof(DummyService));
    };

    It should_return_the_same_instances = () =>
    {
        _result1.ShouldBeTheSameAs(_result2);
    };

    private class DummyService { }
}

You see above that we are making transient the default lifetime and adding a method to set a singleton lifetime.

public interface ILifetime
{
    object GetInstance(Type type, IActivator activator, Func<Type, object> resolver);
}

public class TransientLifetime : ILifetime
{
    public object GetInstance(Type type, IActivator activator, Func<Type, object> resolver)
    {
        return activator.Activate(type, resolver);
    }
}

public class SingletonLifetime : ILifetime
{
    private object _instance;

    public object GetInstance(Type type, IActivator activator, Func<Type, object> resolver)
    {
        if (_instance == null)
            _instance = activator.Activate(type, resolver);
        return _instance;
    }
}

So, an ILifetime has a single method that takes the type, the activator, and the resolver. We are going to clean this up in the next installment, but regardless, these implementations are still relatively simple.

The TransientLifetime is basically a pass-through on the way to the activator. It doesn’t store anything, so it has no need to do any interception. The SingletonLifetime, however, only activates an object once, stores the instance, and then returns that instance every time. I haven’t included threading in here, but a simple lock would suffice for most cases.
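For the curious, a locked version might look like the sketch below (the interfaces are repeated from earlier in this post; FuncActivator is a stand-in, not part of the series’ code, included only to keep the example self-contained):

```csharp
using System;

// Interfaces as defined earlier in this post.
public interface IActivator
{
    object Activate(Type type, Func<Type, object> resolver);
}

public interface ILifetime
{
    object GetInstance(Type type, IActivator activator, Func<Type, object> resolver);
}

public class ThreadSafeSingletonLifetime : ILifetime
{
    private readonly object _sync = new object();
    private object _instance;

    public object GetInstance(Type type, IActivator activator, Func<Type, object> resolver)
    {
        // Without the lock, two threads racing on the first resolve
        // could each activate their own "singleton".
        lock (_sync)
        {
            if (_instance == null)
                _instance = activator.Activate(type, resolver);
            return _instance;
        }
    }
}

// Minimal activator used only for demonstration purposes.
public class FuncActivator : IActivator
{
    private readonly Func<object> _factory;

    public FuncActivator(Func<object> factory)
    {
        _factory = factory;
    }

    public object Activate(Type type, Func<Type, object> resolver)
    {
        return _factory();
    }
}
```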

We need to add a Lifetime and a Singleton method onto our Registration class:

public class Registration
{
    //other properties

    public ILifetime Lifetime { get; private set; }

    public Registration(Type concreteType)
    {
        ConcreteType = concreteType;
        Activator = new ReflectionActivator();
        Lifetime = new TransientLifetime();

        Aliases = new HashSet<Type>();
        Aliases.Add(concreteType);
    }

    //other methods

    public Registration Singleton()
    {
        Lifetime = new SingletonLifetime();
        return this;
    }
}

Finally, we just need to change the Resolve method in our Container class to use the Lifetime as opposed to the Activator and we are all done.

public object Resolve(Type type)
{
    var registration = FindRegistration(type);
    return registration.Lifetime.GetInstance(registration.ConcreteType, registration.Activator, Resolve);
}

In our next post, we’ll do some refactoring, ultimately to support handling cyclic dependencies. A by-product of this is a better coded container. See you then…

Building an IoC Container–Resolving Abstractions

The code for this step is located here.

In our previous post, we added the ability to register types whose dependencies couldn’t be resolved by the container.  These would be dependencies like primitives or abstractions like interfaces.  In this post, we are going to solve our inability to resolve an abstraction by adding aliases to the Registration class.

Below is our test for this functionality:

public class when_resolving_a_type_by_its_alias : ContainerSpecBase
{
    static object _result;

    Establish context = () =>
        _container.Register<DummyService>().As<IDummyService>();

    Because of = () =>
        _result = _container.Resolve(typeof(IDummyService));

    It should_not_return_null = () =>
        _result.ShouldNotBeNull();

    It should_return_an_instance_of_the_requested_type = () =>
        _result.ShouldBeOfType<DummyService>();

    private interface IDummyService { }
    private class DummyService : IDummyService { }
}

Above, the only difference from an API standpoint is the addition of the “As” method. This is basically telling the container that DummyService should be returned when IDummyService is requested.

So, the first change we’ll make is on the Registration class. We’ll add a property called Aliases. Aliases will include all the types that should resolve to the same concrete class, including the concrete type itself. So, Register<SomeService>().As<ISomeServiceA>().As<ISomeServiceB>() will resolve when any of the types ISomeServiceA, ISomeServiceB, or SomeService is requested. Our new registration class looks like this:

public class Registration
{
    public Type ConcreteType { get; private set; }

    public IActivator Activator { get; private set; }

    public ISet<Type> Aliases { get; private set; }

    public Registration(Type concreteType)
    {
        ConcreteType = concreteType;
        Activator = new ReflectionActivator();

        Aliases = new HashSet<Type>();
        Aliases.Add(concreteType);
    }

    public Registration ActivateWith(IActivator activator)
    {
        Activator = activator;
        return this;
    }

    public Registration ActivateWith(Func<Type, Func<Type, object>, object> activator)
    {
        Activator = new DelegateActivator(activator);
        return this;
    }

    public Registration As<T>()
    {
        Aliases.Add(typeof(T));
        return this;
    }
}

The only other change we need to make is in the container where we are trying to find a registration.

private Registration FindRegistration(Type type)
{
    var registration = _registrations.FirstOrDefault(r => r.Aliases.Contains(type));
    if (registration == null)
        registration = Register(type);

    return registration;
}

That’s it! All the tests should still pass and all is good with the world. In the next post, we are going to talk about lifetimes (Singleton, Transient, PerRequest, etc…) and how to add them into our container.

Stay Tuned.

Building an IoC Container–Resolving Types with Unresolvable Dependencies

The code for this step is located here.

From our previous post, we are able to resolve types whose dependencies can themselves be resolved.  We all know from experience that this is hardly ever true without some help.  For instance, a type that takes an integer in its constructor would be impossible to resolve in our current state.  To combat this, we are going to create something called an activator.  Its entire purpose is to, given a type, give back an instance of that type through construction.  The interface is defined below:

public interface IActivator
{
    object Activate(Type type, Func<Type, object> resolver);
}

Nothing special here, but what is special is the abstraction we are creating between the activation of an instance and the implementation.  Our first implementation will be to refactor the current activation code embedded in the Resolve method of the container into a new activator called a ReflectionActivator.

public class ReflectionActivator : IActivator
{
    public object Activate(Type type, Func<Type, object> resolver)
    {
        var ctor = type.GetConstructors()
                       .OrderByDescending(c => c.GetParameters().Length)
                       .First();

        var parameters = ctor.GetParameters();
        var args = new object[parameters.Length];
        for (int i = 0; i < args.Length; i++)
            args[i] = resolver(parameters[i].ParameterType);

        return ctor.Invoke(args);
    }
}

With the exception of the resolver Func that is passed in and its usage, this code is identical to the Resolve method.  The resolver argument has the same signature as the Resolve method, thereby allowing us to perform the same recursion we were using previously.  We can now replace the existing code in Resolve with the below method:

public object Resolve(Type type)
{
    var activator = new ReflectionActivator();
    return activator.Activate(type, Resolve);
}

At this point, all our existing tests should still pass (because we really haven’t done anything).  Let’s go ahead and refactor some more allowing different types to be registered with their own settings.  This is the final refactoring before we can solve the original stated problem.

Below is the Registration class.  It has a property called ConcreteType.  I’m naming it that because I know some things about the future that you don’t, but regardless, it holds the actual type that can be constructed.

public class Registration
{
    public Type ConcreteType { get; private set; }

    public IActivator Activator { get; private set; }

    public Registration(Type concreteType)
    {
        ConcreteType = concreteType;
        Activator = new ReflectionActivator();
    }

    public Registration ActivateWith(IActivator activator)
    {
        Activator = activator;
        return this;
    }

    public Registration ActivateWith(Func<Type, Func<Type, object>, object> activator)
    {
        Activator = new DelegateActivator(activator);
        return this;
    }
}

The take away from the class above is that we have a ConcreteType that has an instance of IActivator associated with it.  Therefore, an unresolvable ConcreteType can be activated differently than one that is resolvable.  For instance, the implementation of the IActivator that solves our original problem is called the DelegateActivator.

public class DelegateActivator : IActivator
{
    private readonly Func<Type, Func<Type, object>, object> _activator;

    public DelegateActivator(Func<Type, Func<Type, object>, object> activator)
    {
        _activator = activator;
    }

    public object Activate(Type type, Func<Type, object> resolver)
    {
        return _activator(type, resolver);
    }
}

Yes, that is a Func that takes a Func as an argument.  This will get cleaned up in a future refactoring, but for now, this is actually quite simple.  The nested Func argument is the Resolve method (give me a type, and I’ll give you back an instance).  So, the last thing to do is to refactor the Container class to use the Registration class and the activators.

public class Container
{
    private List<Registration> _registrations = new List<Registration>();

    public Registration Register(Type type)
    {
        var registration = new Registration(type);
        _registrations.Add(registration);
        return registration;
    }

    public Registration Register<T>()
    {
        return Register(typeof(T));
    }

    public object Resolve(Type type)
    {
        var registration = FindRegistration(type);
        return registration.Activator.Activate(type, Resolve);
    }

    public T Resolve<T>()
    {
        return (T)Resolve(typeof(T));
    }

    private Registration FindRegistration(Type type)
    {
        var registration = _registrations.FirstOrDefault(r => r.ConcreteType == type);
        if (registration == null)
            registration = Register(type);

        return registration;
    }
}

Just to note, we are allowing the resolution of types that have not been explicitly registered with the container.  I believe that some other notable IoC container authors disagree with this practice, namely Krzysztof Koźmic and Nicolas Blumhardt.  Their reasoning is that without explicit configuration, it is easy for the container to misbehave.  I say, have enough tests so that doesn’t become an issue.  Regardless, this is how it is coded and if you disagree, rip out that part and throw an exception.

One other thing to note is that we are not thread-safe. This would be relatively simple to add using the new concurrent collection types in .NET 4.0. I’ll leave that to you to play around with.
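As one possible sketch (not part of the series’ code, and simplified to ignore activators and aliases), the registration list could be swapped for a ConcurrentDictionary keyed by type, with GetOrAdd making the find-or-auto-register step safe under concurrent resolves:

```csharp
using System;
using System.Collections.Concurrent;

// Simplified stand-in for the series' Registration class.
public class Registration
{
    public Type ConcreteType { get; private set; }

    public Registration(Type concreteType)
    {
        ConcreteType = concreteType;
    }
}

public class ThreadSafeRegistry
{
    // GetOrAdd ensures two threads resolving the same unregistered
    // type end up sharing a single Registration instance.
    private readonly ConcurrentDictionary<Type, Registration> _registrations =
        new ConcurrentDictionary<Type, Registration>();

    public Registration FindRegistration(Type type)
    {
        return _registrations.GetOrAdd(type, t => new Registration(t));
    }
}
```

Note that GetOrAdd may invoke the factory more than once under contention, but only one Registration wins the race; that is harmless here because constructing a Registration has no side effects.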

Below is the new test we have added for this batch of functionality:

public class when_resolving_a_type_with_unresolvable_dependencies : ContainerSpecBase
{
    static object _result;

    Establish context = () =>
        _container.Register<DummyService>()
            .ActivateWith((t, r) => new DummyService(9, (DepA)r(typeof(DepA))));

    Because of = () =>
        _result = _container.Resolve(typeof(DummyService));

    It should_not_return_null = () =>
        _result.ShouldNotBeNull();

    It should_return_an_instance_of_the_requested_type = () =>
        _result.ShouldBeOfType<DummyService>();

    private class DummyService
    {
        public DummyService(int i, DepA a)
        { }
    }

    private class DepA { }
}

Like I said above, a little ugly with that ActivateWith call. It is saying to use 9 for the integer and resolve DepA from the container.

In the next post, we’ll get into how to handle resolving IDummyService, which is basically the whole point of this exercise. Stay tuned…