Workflow services and distributed transactions — argh!

When you need to use MS DTC its normal to go through a stage of mentally preparing for some frustrations along the way. At least I do, but what I want to talk about is a very specific issue that surfaced only when I mixed workflow services and distributed transactions.

Starting a workflow service in a transaction is fairly straightforward and natively supported by Workflow Foundation. See TransactedReceiveScope documentation and use of TransactedReceiveScope code sample for more information. However, there is a common usage scenario that may get you into trouble with no clear reason. Lets say you have an application using workflow services and is leveraging the default SQL Server workflow instance store available in AppFabric. Additionally, the same application stores its data in a SQL Server database which, and this is the critical part, is also on the same SQL Server instance.

With the previous scenario if you need to transactionally start a workflow service and interact with the application database in the same transaction, such as to store the instance identifier of the started workflow service, you may get bitten by a race condition that will trigger the following, not so helpful, error message: “Transaction context in use by another session.”

The following sample code illustrates the previously mentioned scenario.

using (var context = new TransactionScope()) {
    using (var connection = CreateSqlConnection()) {
        // Start workflow
        string instanceId = client.Initiate();

        // Opening the connection after workflow start
        // leads to a race condition that causes errors
        connection.Open();

        SaveInstanceId(connection, instanceId);
    }

    context.Complete();
}

Searching the web for the error message will give you a few hits but none is specific to this given scenario. For example, you have a MSDN page about Using the TransactionScope Class that references the message but doesn’t go into details about it, you’ll also find a couple of StackExchange questions mostly about this problem when using loopback linked servers which also does not apply and then — thank god — there is a MSDN blog post that explains the issue and makes the solution clear.

… when two or more separate SqlConnections from same process or even different processes attempt to simultaneously enlist in the same distributed transaction, one or more may fail to enlist and report an exception. The reason for this is the server side code has no tolerance for multiple concurrent enlist operations on the same transaction, it will just immediately fail one of them. Server will not try to wait for a little and try again, server will not queue the requests, it will just immediately fail the conflicting one. So far the SQL Product team has no plans to support this functionality (simultaneously enlist two different connections in the same distributed) right now.
Freist Li, System.Transaction may fail in multiple-thread environment

With knowledge about the root cause it is now possible to update the sample code to make sure that the connection to the application database is not opened at exactly the same time as the connection opened by workflow runtime to persist the workflow in the instance store. The following sample code illustrates the fix:

using (var context = new TransactionScope()) {
    using (var connection = CreateSqlConnection()) {
        // Opening the connection before workflow start
        // eliminates the race condition
        connection.Open();

        // Start workflow
        string instanceId = client.Initiate();

        SaveInstanceId(connection, instanceId);
    }

    context.Complete();
}

This problem occurs at least in SQL Server 2008 R2, but judging by the blog post mentioned above this is a behavior that it’s not likely to change so it may also surface in more recent versions. A full sample application that illustrates the issue and can be used to test the behavior of more recent SQL Server versions is available at WFTransactions repository. Just remember that this is a race condition so even the wrong code will work most of the times.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s