Workflow services and distributed transactions — argh!

When you need to use MS DTC, it’s normal to go through a stage of mentally preparing for some frustrations along the way. At least I do. What I want to talk about, though, is a very specific issue that surfaced only when I mixed workflow services and distributed transactions.

Starting a workflow service in a transaction is fairly straightforward and natively supported by Workflow Foundation. See the TransactedReceiveScope documentation and the Use of TransactedReceiveScope code sample for more information. However, there is a common usage scenario that may get you into trouble for no clear reason. Let’s say you have an application that uses workflow services and leverages the default SQL Server workflow instance store available in AppFabric. Additionally, the same application stores its data in a SQL Server database which, and this is the critical part, is on the same SQL Server instance.

In this scenario, if you need to start a workflow service and interact with the application database in the same transaction, for example to store the instance identifier of the started workflow service, you may get bitten by a race condition that triggers the following, not so helpful, error message: “Transaction context in use by another session.”

The following sample code illustrates the previously mentioned scenario.

using (var context = new TransactionScope()) {
    using (var connection = CreateSqlConnection()) {
        // Start workflow
        string instanceId = client.Initiate();

        // Opening the connection after workflow start
        // leads to a race condition that causes errors
        connection.Open();

        SaveInstanceId(connection, instanceId);
    }

    context.Complete();
}

Searching the web for the error message gives you a few hits, but none is specific to this scenario. For example, there is an MSDN page about Using the TransactionScope Class that references the message but doesn’t go into details about it. You’ll also find a couple of StackExchange questions, mostly about this problem when using loopback linked servers, which does not apply here either. And then, thank god, there is an MSDN blog post that explains the issue and makes the solution clear.

… when two or more separate SqlConnections from same process or even different processes attempt to simultaneously enlist in the same distributed transaction, one or more may fail to enlist and report an exception. The reason for this is the server side code has no tolerance for multiple concurrent enlist operations on the same transaction, it will just immediately fail one of them. Server will not try to wait for a little and try again, server will not queue the requests, it will just immediately fail the conflicting one. So far the SQL Product team has no plans to support this functionality (simultaneously enlist two different connections in the same distributed) right now.
Freist Li, System.Transaction may fail in multiple-thread environment

With knowledge about the root cause it is now possible to update the sample code to make sure that the connection to the application database is not opened at exactly the same time as the connection opened by workflow runtime to persist the workflow in the instance store. The following sample code illustrates the fix:

using (var context = new TransactionScope()) {
    using (var connection = CreateSqlConnection()) {
        // Opening the connection before workflow start
        // eliminates the race condition
        connection.Open();

        // Start workflow
        string instanceId = client.Initiate();

        SaveInstanceId(connection, instanceId);
    }

    context.Complete();
}

This problem occurs at least in SQL Server 2008 R2, but judging by the blog post mentioned above this behavior is not likely to change, so it may also surface in more recent versions. A full sample application that illustrates the issue and can be used to test the behavior of more recent SQL Server versions is available at the WFTransactions repository. Just remember that this is a race condition, so even the wrong code will work most of the time.
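When experimenting with this, it can help to confirm that the ambient transaction has actually been promoted to a distributed one, since the quoted explanation is about connections enlisting in the same distributed transaction. The following is a minimal sketch that only assumes System.Transactions; the TransactionDiagnostics class and its IsDistributed method are hypothetical names introduced here for illustration.

```csharp
using System;
using System.Transactions;

public static class TransactionDiagnostics
{
    // Hypothetical helper: returns true when there is an ambient transaction
    // and it carries a distributed identifier, which stays Guid.Empty while
    // the transaction has not been promoted.
    public static bool IsDistributed()
    {
        Transaction current = Transaction.Current;

        return current != null
            && current.TransactionInformation.DistributedIdentifier != Guid.Empty;
    }
}
```

Logging the result of this call at different points inside the TransactionScope, for example before and after starting the workflow, can show at which point the promotion happens.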

Why Software Development is a pain?

Because it depends!

Entangled wires
Entangled by Jinx!, CC BY-SA 2.0.

Joke aside, I find that the most difficult challenge developers struggle with is that most things that really matter in terms of impact on the success of a software project depend on guidelines, patterns, best practices and whatever other name you may call it.

Every time I hear someone defend some piece of code with a pattern or a guideline I keep hoping to hear the real justification behind it, like what does the project gain from adopting this guideline or following this pattern, but people generally stop at the pattern/guideline reference.

C’mon folks, that’s not enough, we call them patterns, guidelines and best practices for a reason… they are applicable in a given context and will most likely help you, but they cannot be applied blindly. Otherwise, they would be called rules and software development would be a breeze.

For example, just because someone says to prefer interfaces over abstract classes you should not hardcode in your subconscious that an interface will solve all your problems. In the same tone, you can always solve a problem by adding another layer of abstraction, except when your problem becomes too many layers of abstraction.

Another one of my favorites is the quest to eliminate any possible duplication. Don’t get me wrong, I totally believe in DRY, but you need to have some common sense: if two things are the same now but may change for completely different reasons, you should carefully evaluate whether you gain anything from trying to avoid the so-called duplication.

You should do stuff that helps you tame the complexity of the software you’re developing even if it contradicts one or more guidelines or best practices. If you want to know more on this subject, you should read Code Complete by Steve McConnell. It’s a pretty big book, but totally worth it.

Structuring your .NET solution on the file system

This is probably a subject at the same level of controversy as the one about where to put those pesky curly braces but let’s try setting personal preferences aside and look at it in an objective way.

It’s a fact that developers love consistency, or at least in my opinion if they don’t then it is a disqualifying criterion in any job interview. However, another thing that developers tend to love is their own sense of consistency. Basically, they love being consistent in the way they like best, so it is important to pick one approach and make everyone stick to it.

When it comes to structuring a .NET solution in the filesystem I generally see two approaches being used:

  1. Hierarchical;
  2. Mostly flat.

In the first one it’s typical to see each part of a project name map to a physical folder. In this scenario, for Contoso.Web.Controls.[cs|k]proj you would expect to find the code in the Contoso\Web\Controls\ path. In the mostly flat camp, the code for the same project would be found in a folder named Contoso.Web.Controls\.

I’m in the mostly flat camp because I value that the number of parts in a project name does not affect the level at which it will be physically located. This tends to simplify things a lot when it comes to working with the filesystem structure either in build scripts or just for setting common output folders.
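The difference between the two layouts can be sketched in a few lines of C#. This is only an illustration: the ProjectLayout class and its method names are hypothetical, and the root folder is whatever folder holds your projects.

```csharp
using System;
using System.IO;

public static class ProjectLayout
{
    // Hierarchical: every dot-separated part of the project name
    // becomes one folder level below the root.
    public static string HierarchicalPath(string root, string projectName)
    {
        string[] parts = projectName.Split('.');
        return Path.Combine(root, Path.Combine(parts));
    }

    // Mostly flat: the whole project name is a single folder
    // directly below the root, no matter how many parts it has.
    public static string FlatPath(string root, string projectName)
    {
        return Path.Combine(root, projectName);
    }
}
```

With the flat layout, Contoso.Web and Contoso.Web.Controls both end up exactly one level below the root, so a relative path from a project folder to something shared, such as a common output folder, is the same for every project.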

Another reason in favor of the mostly flat structure is that it aligns best with what we see most often in the .NET open source community; see EntityFramework or ASP.NET MVC. This second part is especially important now that .NET itself has taken significant steps to become an open source framework.

Couple this second approach with the structure proposed by David Fowler and you have a quick set of rules that will allow you to be consistent within your project while at the same time following the overall trend in the .NET ecosystem.

A Different Kind of Assembly Hell

An assembly is first and foremost a deployment unit. Assemblies should normally be used to group code that works together and is deployed together or, putting it the other way around, to split code that may not be deployed together.

There are reasons to split code between multiple assemblies even if you intend to deploy them together but these are exceptions to the rule. I would see independent versioning requirements as one possible exceptional reason.

What you really shouldn’t do is create assemblies just for the sake of splitting your code for cosmetic reasons. I’m not saying that you shouldn’t organize your code, just saying there are better tools for that job. In this case, that would be namespaces alongside project folders and while on the subject of namespaces, another thing that really does not make any sense is to try to have a single namespace per assembly. If you’re down that path, take a step back cause you’re doing it wrong…
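To make the namespace point concrete, here is a minimal sketch of a single project, and therefore a single assembly, that organizes its code in more than one namespace. All type and namespace names are hypothetical.

```csharp
using System;

// Both namespaces below are compiled into the same assembly;
// the split is purely organizational.
namespace Contoso.Orders
{
    public class Order
    {
        public decimal Amount { get; set; }
    }
}

namespace Contoso.Orders.Pricing
{
    public class PriceCalculator
    {
        // Trivial placeholder logic, only here to give the type a body.
        public decimal ApplyDiscount(Contoso.Orders.Order order, decimal rate)
        {
            return order.Amount * (1 - rate);
        }
    }
}
```

In a typical project, each namespace would map to a project folder, which gives you the structure without adding another deployment unit.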

I have seen, more than once, .NET solutions suffer from this assembly explosion, quickly escalating to hundreds of assemblies for something that sure as hell wasn’t that complex, and where 80% of the assemblies end up being deployed together due to a high level of dependencies.

However, you also need to avoid doing the opposite and cramming everything into a single assembly. As with pretty much everything in software development, the correct answer depends on many things specific to the scenario at hand.

Be conscious of your decisions and why you make them.

Jump-start Your Mind

I haven’t written in a while mostly because I’ve spent my time reading what others wrote and today I’m writing purely motivated by what I just finished reading.

Mindfire: Big Ideas for Curious Minds by Scott Berkun was such a pleasure to read, and ended up igniting parts of my mind which I have to admit were becoming a bit numb, that I felt the urge to take some time to first say thank you to the author and then recommend it to all you curious minds out there.

Now, go read it… it’s time well spent.

We Don’t Need No Regions

If your code reaches a level where you want to hide it behind regions then you have a problem that regions won’t solve. Regions are good for hiding things that you don’t want to have knowledge about, such as auto-generated code. Normally, when you’re developing you end up reading more code than you write, so why would you want to complicate the reading process?

I, for one, would love to have that one discussion around regions where someone convinces me that they solve a problem that has no other alternative solution, but I’m still waiting. The most frequent argument I hear about regions is that they allow you to structure your code, but why not just structure it using classes, methods and all that other stuff that OOP is about? At the end of the day, you should be doing object oriented programming and not region oriented programming.

Having said that, I do believe that sometimes it is helpful to have a quick overview of a code file’s contents, and Visual Studio allows you to do just that through the Collapse to Definitions command (CTRL + M, CTRL + O), which collapses the members of all types. If you like regions, you should try this; it is much more useful to read all the members of a type than all the regions inside a type.

Unit Testing DateTime – The Crazy Way

We all know that the process of unit testing code that depends on DateTime, particularly the current time provided through the static properties (Now, UtcNow and Today), is a PITA.

If you go ask how to unit test DateTime.Now on stackoverflow I’ll bet that you’ll get two kinds of answers:

  1. Encapsulate the current time in your own interface and use a standard mocking framework;
  2. Pull out the big guns like Typemock Isolator, JustMock or Microsoft Moles/Fakes and mock the static property directly.

Now each alternative has its pros and cons, and I would have to say that I lean more toward the second approach because the first adds a layer of abstraction just for the sake of testability. However, the second approach depends on commercial tools that not every shop wants to buy or on the not so friendly Microsoft Moles. (Sidenote: Moles is now named Fakes and it will ship with VS 2012)

This tends to leave people without an acceptable and simple solution, so after reading another of these types of questions on SO I came up with yet another alternative, one based on the first alternative that I presented here but that tries really hard to not get in your way with yet another layer of abstraction.

So, without further ado, I present to you the Tardis. The Tardis is a single section of conditionally compiled code that overrides the meaning of the DateTime expression inside a single class. You still get the normal coding experience of using DateTime all over the place, but in a DEBUG compilation your tests will be able to mock every static method or property of the DateTime class.

An example follows, while the full Tardis code can be downloaded from GitHub:

using System;
using NSubstitute;
using NUnit.Framework;
using Tardis;

public class Example
{
    public Example()
        : this(string.Empty) { }

    public Example(string title)
    {
#if DEBUG
        this.DateTime = DateTimeProvider.Default;
        this.Initialize(title);
    }

    internal IDateTimeProvider DateTime { get; set; }

    internal Example(string title, IDateTimeProvider provider)
    {
        this.DateTime = provider;
#endif
        this.Initialize(title);
    }

    private void Initialize(string title)
    {
        this.Title = title;
        this.CreatedAt = DateTime.UtcNow;
    }

    private string title;

    public string Title
    {
        get { return this.title; }
        set
        {
            this.title = value;
            this.UpdatedAt = DateTime.UtcNow;
        }
    }

    public DateTime CreatedAt { get; private set; }
    public DateTime UpdatedAt { get; private set; }
}

[TestFixture]
public class TExample
{
    [Test]
    public void T001()
    {
        // Arrange
        var tardis = Substitute.For<IDateTimeProvider>();
        tardis.UtcNow.Returns(new DateTime(2000, 1, 1, 6, 6, 6));

        // Act
        var sut = new Example("Title", tardis);

        // Assert
        Assert.That(sut.CreatedAt, Is.EqualTo(tardis.UtcNow));
    }

    [Test]
    public void T002()
    {
        // Arrange
        var tardis = Substitute.For<IDateTimeProvider>();
        var sut = new Example("Title", tardis);
        tardis.UtcNow.Returns(new DateTime(2000, 1, 1, 6, 6, 6));

        // Act
        sut.Title = "Updated";

        // Assert
        Assert.That(sut.UpdatedAt, Is.EqualTo(tardis.UtcNow));
    }
}
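The example relies on the IDateTimeProvider interface and the DateTimeProvider.Default instance that come with the Tardis code. For readers who just want the gist, here is a minimal sketch of what such types could look like. It is an assumption, not the actual Tardis source: it lives in a made-up TardisSketch namespace and covers only the three static properties mentioned earlier.

```csharp
using System;

namespace TardisSketch
{
    // Assumed shape: instance members that mirror the static
    // members of System.DateTime used by the code under test.
    public interface IDateTimeProvider
    {
        DateTime Now { get; }
        DateTime UtcNow { get; }
        DateTime Today { get; }
    }

    // Default implementation that simply forwards to System.DateTime,
    // so production code behaves exactly as if it used DateTime directly.
    public sealed class DateTimeProvider : IDateTimeProvider
    {
        public static readonly IDateTimeProvider Default = new DateTimeProvider();

        private DateTimeProvider() { }

        public DateTime Now { get { return DateTime.Now; } }
        public DateTime UtcNow { get { return DateTime.UtcNow; } }
        public DateTime Today { get { return DateTime.Today; } }
    }
}
```

Because the Example class declares a property named DateTime of this interface type, an expression such as DateTime.UtcNow inside that class binds to the property in a DEBUG build and to System.DateTime otherwise, while type positions such as the CreatedAt declaration keep referring to System.DateTime.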

This approach is also suitable for other similar classes with commonly used static methods or properties like the ConfigurationManager class.