Code Trivia #3

We have a database containing one hierarchical table representing a categories tree. The table holds just two records, one the root of the tree (IdParent=NULL) and the other is a child of the first.

Using LINQ to SQL or the Entity Framework you could write the following code:

// Table Category contains 2 records
//
//  Record 1: Id=1 | IdParent=NULL
//  Record 2: Id=2 | IdParent=1

static int GetChildrenCount(int? idParent)
{
    int recordCount = -1;

    using (var context = new ExamplesDataContext())
    {
        recordCount = context.Categories
            .Where(c => c.IdParent == idParent)
            .Count();
    }

    return recordCount;
}

static void Main(string[] args)
{
    // Count root categories
    Console.WriteLine(GetChildrenCount(null));

    // Count children of category 1
    Console.WriteLine(GetChildrenCount(1));

    // Count root categories
    using (var context = new ExamplesDataContext())
    {
        int recordCount = context.Categories
            .Where(c => c.IdParent == null)
            .Count();

        Console.WriteLine(recordCount);
    }
}

What would be the output of this snippet?


Update:

This snippet would output the following;

// 0
// 1
// 1

This situation is caused by a design decision in LINQ to SQL where a comparison with the literal null translates to the appropriate SQL version (is null or is not null), but for a non literal comparison (comparing against a variable that holds null) the comparison is directly mapped to an equals comparison with null which as specified by default in SQL Server results in a comparison result of false.

As a workaround you can construct the where condition in GetChildrenCount like this:

static int GetChildrenCount(int? idParent)
{
    int recordCount = -1;

    using (var context = new ExamplesDataContext())
    {
        recordCount = context.Categories
            .Where(c => (c.IdParent == idParent || (idParent == null && c.IdParent == null)))
            .Count();
    }

    return recordCount;
}
Advertisement

SQLXML 4.0 Bulk Load Console Application

Earlier this year I had the need to do the initial data load for a new system. The data source was a highly complex and customized Excel spreadsheet that provided no way to easily automate the migration.

Manual processing it’s definitely not my cup of tea, so I decided that at least the result should be something much more manageable which immediately ruled out creating discrete SQL insert commands, given the hierarchical and highly relational nature of the data.

XML seemed like the way to go, but then I would need a custom application to interpret it and finally load it to the database. Fortunately Microsoft did just that with SQLXML, providing a very flexible way to load XML data into a relational database.

Being a console guy I ended up creating a quick console application that enabled me to more easily automate the process. The first version of this application contained almost all SQL XML configuration hardcoded, but recently I had some time and improved it with command line parameters for the available configuration. Try it here.

Exporting and Importing Data Using bcp Utility

Transferring data between Microsoft SQL Server instances or even between databases in the same instance presents some constraints when the table schema on each database don’t exactly match.

There are scenarios where we can workaround these limitations by using the format files supported by the bcp utility.

For the first scenario, as you can see in the following picture, we try to import data to a table similar to the source table, but with an extra column. Even though the new column can accept NULL values importing the data to this table will fail if a format file is not used.

Import to a table with an extra column
Import to a table with an extra column

Import operation fails  (zero rows imported):

bcp export.dbo.Contact out data.dat -n -S localhost -T
bcp import.dbo.Contact in data.dat -n -S localhost -T
pause

Import operation succeeds:

bcp export.dbo.Contact format nul -f Contact.ff -n -S localhost -T
bcp export.dbo.Contact out data.dat -f Contact.ff -S localhost -T
bcp import.dbo.Contact in data.dat -f Contact.ff -S localhost -T
pause

In the previous scenario the extra column was at the end of the table meaning that all remaining columns were in the same order in both tables. Another scenario is when both source and target tables have matching columns, but their order is different as illustrated in the next figure. This case can also be resolved using format files, but now we need to edit the file before the import operation to correctly map column positions.

Import to table with columns in different order
Import to table with columns in different order

Using the format file generated against the source table without modifications may result in errors during the import or even worse because if columns data types are compatible the import operation could succeed but the data would be stored in the wrong columns.

Import operation succeeds but data for Fax column is loaded into OtherPhone and vice-versa:

bcp export.dbo.Contact format nul -f Contact-exp.ff -n -S localhost -T
bcp export.dbo.Contact out data.dat -f Contact-exp.ff -S localhost -T
bcp import.dbo.Contact in data.dat -f Contact-exp.ff -S localhost -T
pause

To fix this issue we need to update the format file section representing the column order of the table. Doing this manually for several tables would be tedious and time consuming so I implemented a small utility (Format File Mapper) that given the format file associated to the source table and the format file associated to the target table will generate a mapped format file with the column’s order updated.

Import operation succeeds:

bcp export.dbo.Contact format nul -f Contact-exp.ff -n -S localhost -T
bcp export.dbo.Contact out data.dat -f Contact-exp.ff -S localhost -T
bcp import.dbo.Contact format nul -f Contact-imp.ff -n -S localhost -T
ffmapper -s Contact-exp.ff -t Contact-imp.ff -o Contact-mapped.ff
bcp import.dbo.Contact in data.dat -f Contact.Mapped.ff -S localhost -T
pause

This utility only supports non-XML format files and was only tested against format files generated by SQL Server 2005. However, from what I’ve seen in MSDN, SQL Server 2008 creates non-XML format files with the same structure so it should also work.

There are  of course other technologies better suited for data migration scenarios but for simpler cases and also for the command-line lover this can prove to be a very useful solution.