The CSV file format in Effort

This post describes the file format that is compatible with the CsvDataLoader component of the Effort library.

The component accepts files that follow the traditional CSV format:

  • The first row contains the header names
  • Comma ( , ) is the separator character
  • Double quote ( " ) is the delimiter character
  • Double double quote ( "" ) is used to express a single double quote between delimiters

There are some additional requirements that need to be taken into consideration.

  • Numbers and dates are parsed with invariant culture setting
  • Binaries are encoded in base64 format
  • Null values are represented with empty fields without delimiters
  • Empty strings are represented with empty fields with delimiter
  • Backslash serves as escape character for backslash and newline characters

These are all the rules that need to be followed. The next example demonstrates them with a compatible CSV file.

id,name,birthdate,reportto,storages,photo
"JD","John Doe",01/23/1982,"MHS","\\\\server1\\share8\r\n\\\\server2\\share3",
"MHS","Michael ""h4x0r"" Smith",05/12/1975,,"","ZzVlKyszZjQ5M2YzNA=="

The content of each database table is represented by a dedicated CSV file that has to be named {table name}.csv.
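To make the quoting rules concrete, here is a minimal sketch of how a single record could be split into fields. It is not the CsvDataLoader implementation; the class and method names are hypothetical. It also assumes that the backslash escapes are `\\`, `\r` and `\n` (as the example file suggests) and that they only apply between delimiters.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of Effort.
public static class CsvRecordSketch
{
    // Splits one record into fields.
    // A null entry means "empty field without delimiters" (null value);
    // an empty string means "empty field with delimiters".
    public static List<string> ParseRecord(string line)
    {
        var fields = new List<string>();
        int i = 0;
        while (true)
        {
            if (i < line.Length && line[i] == '"')
            {
                i++; // skip the opening delimiter
                var sb = new StringBuilder();
                while (true)
                {
                    if (i >= line.Length)
                        throw new FormatException("Unterminated quoted field.");
                    char c = line[i];
                    if (c == '"')
                    {
                        if (i + 1 < line.Length && line[i + 1] == '"')
                        {
                            sb.Append('"'); // doubled quote -> one literal quote
                            i += 2;
                        }
                        else
                        {
                            i++; // closing delimiter
                            break;
                        }
                    }
                    else if (c == '\\' && i + 1 < line.Length)
                    {
                        char next = line[i + 1];
                        switch (next)
                        {
                            case '\\': sb.Append('\\'); break;
                            case 'r': sb.Append('\r'); break;
                            case 'n': sb.Append('\n'); break;
                            // Assumption: unknown escapes are kept unchanged.
                            default: sb.Append(c).Append(next); break;
                        }
                        i += 2;
                    }
                    else
                    {
                        sb.Append(c);
                        i++;
                    }
                }
                fields.Add(sb.ToString());
            }
            else
            {
                // Unquoted field: take everything up to the next separator.
                int end = line.IndexOf(',', i);
                if (end < 0) end = line.Length;
                string raw = line.Substring(i, end - i);
                fields.Add(raw.Length == 0 ? null : raw);
                i = end;
            }

            if (i >= line.Length) break;
            if (line[i] != ',')
                throw new FormatException("Expected a comma after a field.");
            i++; // skip the separator
        }
        return fields;
    }
}
```

Calling `CsvRecordSketch.ParseRecord` on the second data row of the example yields six fields, where the fourth is null (no delimiters) and the fifth is an empty string (delimiters present). Typed conversion would follow as a separate step, for instance `DateTime.Parse(field, CultureInfo.InvariantCulture)` for dates and `Convert.FromBase64String(field)` for binaries.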


Data loaders in Effort

Data loaders are components of the Effort library that help set up the initial state of a fake database.

Adding records to the tables through the Entity Framework API can be inflexible, and the resulting code may become hard to maintain. Furthermore, these insert operations flow through the entire EF and Effort pipeline, which can have a significant performance impact. Data loaders solve these problems by allowing data to be inserted from any custom source during initialization with very little overhead.

They are really easy to use: all the developer has to do is create a data loader instance and pass it to the chosen Effort factory method. For example:

var dataLoader = new EntityDataLoader("name=MyEntities");

var connection = DbConnectionFactory.CreateTransient(dataLoader);

Effort provides multiple built-in data loaders:

  • EntityDataLoader
  • CsvDataLoader
  • CachingDataLoader

EntityDataLoader is able to fetch data from an existing database by utilizing an existing Entity Framework compatible ADO.NET provider. It is initialized with an entity connection string.

var dataLoader = new EntityDataLoader("name=MyEntities");

The purpose of CsvDataLoader is to read data records from CSV files. It is initialized with a path that points to a folder containing the CSV files. Each file represents the content of a database table.

var dataLoader = new CsvDataLoader(@"C:\path\to\files");

The exact format of these CSV files is documented in a separate post. There is also a small tool that helps developers export the data from an existing database into appropriately formatted CSV files.

The CachingDataLoader was designed to speed up the initialization process by wrapping any kind of data loader with a cache layer. The first time a wrapped data loader is used with a particular configuration, the CachingDataLoader pulls the required data from it and, as a side effect, caches that data in memory. If a CachingDataLoader is later initialized to wrap the same kind of data loader with the same configuration, the data is retrieved from the previously created cache and the wrapped data loader is not used.

var wrappedDataLoader = new CsvDataLoader(@"C:\path\to\files");

var dataLoader = new CachingDataLoader(wrappedDataLoader, false);
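The caching behavior described above can be illustrated with a small stand-alone sketch. It does not use Effort's types: `IRecordSource`, `CountingSource` and `CachingSource` are hypothetical names, and keying the cache on the wrapped type plus a configuration string is an assumption about how "same kind of data loader with the same configuration" could be detected.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical stand-in for a data loader; this is not Effort's IDataLoader.
public interface IRecordSource
{
    string Configuration { get; }   // e.g. a folder path or a connection string
    IReadOnlyList<string> Load();   // the expensive step
}

// A source that counts how often the expensive step actually runs.
public sealed class CountingSource : IRecordSource
{
    public static int LoadCalls;

    public CountingSource(string configuration)
    {
        Configuration = configuration;
    }

    public string Configuration { get; }

    public IReadOnlyList<string> Load()
    {
        LoadCalls++;
        return new[] { "row from " + Configuration };
    }
}

// Wraps another source and remembers its result for the whole process.
public sealed class CachingSource : IRecordSource
{
    // Assumption: type name plus configuration identifies "the same loader".
    private static readonly ConcurrentDictionary<string, IReadOnlyList<string>> Cache =
        new ConcurrentDictionary<string, IReadOnlyList<string>>();

    private readonly IRecordSource wrapped;

    public CachingSource(IRecordSource wrapped)
    {
        this.wrapped = wrapped;
    }

    public string Configuration
    {
        get { return wrapped.Configuration; }
    }

    public IReadOnlyList<string> Load()
    {
        string key = wrapped.GetType().FullName + "|" + wrapped.Configuration;
        // Note: under contention GetOrAdd may invoke the factory more than once,
        // but only one result is stored and returned to all callers.
        return Cache.GetOrAdd(key, _ => wrapped.Load());
    }
}
```

With this sketch, two `CachingSource` instances that wrap separate `CountingSource` objects created with the same configuration string trigger only one real `Load` call; a different configuration string causes a new one.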

Each data loader suits different scenarios. I suggest using EntityDataLoader during interactive testing, while CachingDataLoader and CsvDataLoader combined can be really useful in automated tests.

Factory methods in Effort (CreateTransient vs CreatePersistent)

The Effort library provides multiple factory classes that developers can choose from. All of them can serve well in different scenarios.

  • DbConnectionFactory
  • EntityConnectionFactory
  • ObjectContextFactory

They are capable of creating different types of fake data endpoints: DbConnection, EntityConnection and ObjectContext, respectively. Fake DbConnection objects are ideal for the Code First programming approach. Fake EntityConnection objects are meant for the Database First and Model First techniques: ObjectContext objects can be instantiated by passing such a connection as a constructor argument. This can also be done automatically by creating fake ObjectContext instances directly.

All the factory components provide two kinds of factory methods.

  • CreateTransient
  • CreatePersistent

What is the difference between them? The answer lies in the lifecycle of the underlying in-memory database bound to the endpoint. Let’s examine these factory methods with an extremely simple demonstration that uses the DbConnectionFactory component.

The demo includes a single Person entity with two members:

public class Person
{
    [Key]
    public int Id { get; set; }

    public string Name { get; set; }
}

The DbContext class that provides the entity endpoint can be defined easily too. Note that it needs a constructor that can accept a DbConnection object.

public class PeopleDbContext : DbContext
{
    public PeopleDbContext(DbConnection connection)
        : base(connection, true)
    {
    }

    public IDbSet<Person> People { get; set; }
}

First, the CreateTransient method is demonstrated. The following code instantiates a DbContext object with a fake DbConnection object and adds some data to the fake database. It then creates another DbContext instance with a newly created fake DbConnection and tries to query the previously added data.

using (var ctx = new PeopleDbContext(Effort.DbConnectionFactory.CreateTransient()))
{
    ctx.People.Add(new Person() { Id = 1, Name = "John Doe" });
    ctx.SaveChanges();

    Console.WriteLine("Test 1 - First Count: {0}", ctx.People.Count());
}

using (var ctx = new PeopleDbContext(Effort.DbConnectionFactory.CreateTransient()))
{
    Console.WriteLine("Test 1 - Second Count: {0}", ctx.People.Count());
}

This code outputs the following:

Test 1 - First Count: 1
Test 1 - Second Count: 0

The second DbContext instance is not able to see the entity added by the first DbContext instance. Every time a fake connection object is created with the CreateTransient method, the new object redirects its data operations to a completely new, unique in-memory database instance. No other endpoint can use that database; it is completely isolated. In addition, the database is cleaned up when the connection object is no longer in use.

Now change the previous code slightly so that it uses the CreatePersistent method. Note that this method accepts a string argument. This string identifies the fake database instance that the connection object is bound to.

using (var ctx = new PeopleDbContext(Effort.DbConnectionFactory.CreatePersistent("1")))
{
    ctx.People.Add(new Person() { Id = 1, Name = "John Doe" });
    ctx.SaveChanges();

    Console.WriteLine("Test 2 - First Count: {0}", ctx.People.Count());
}

using (var ctx = new PeopleDbContext(Effort.DbConnectionFactory.CreatePersistent("1")))
{
    Console.WriteLine("Test 2 - Second Count: {0}", ctx.People.Count());
}

This code outputs the following:

Test 2 - First Count: 1
Test 2 - Second Count: 1

The second DbContext instance is able to see the newly added entity this time, because the connection objects are bound to the same database instance. This is possible because they were created with the same identifier.

In the last code sample the connection objects are created with the CreatePersistent method too, but with different identifiers.

using (var ctx = new PeopleDbContext(Effort.DbConnectionFactory.CreatePersistent("2")))
{
    ctx.People.Add(new Person() { Id = 1, Name = "John Doe" });
    ctx.SaveChanges();

    Console.WriteLine("Test 3 - First Count: {0}", ctx.People.Count());
}

using (var ctx = new PeopleDbContext(Effort.DbConnectionFactory.CreatePersistent("3")))
{
    Console.WriteLine("Test 3 - Second Count: {0}", ctx.People.Count());
}

This code outputs the following:

Test 3 - First Count: 1
Test 3 - Second Count: 0

The second DbContext object is not able to see the added entity, because the fake connection objects are bound to separate database instances. The reason is that the identifiers used to create them are not identical.

These two kinds of factory methods are useful in different scenarios. For example, if someone wants to write automated tests for data oriented components, then CreateTransient is the way to go, because it ensures that the tests run in completely isolated environments. They can even run in parallel. If someone wants to test the system interactively without screwing up the persisted data in the real database, then CreatePersistent might be worth a shot.
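The lifecycle difference shown by the three tests can be modeled with a toy factory. This is only a sketch of the idea, not how Effort manages its databases: `FakeStore` and `FakeStoreFactory` are hypothetical, and the method names merely mirror the ones discussed in this post.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical in-memory "database": just a list of names.
public sealed class FakeStore
{
    public List<string> People { get; } = new List<string>();
}

// Hypothetical factory that imitates the two lifecycles.
public static class FakeStoreFactory
{
    // Stores created with an identifier live here and are shared by that identifier.
    private static readonly ConcurrentDictionary<string, FakeStore> Persistent =
        new ConcurrentDictionary<string, FakeStore>();

    // Transient: every call yields a brand-new store that nobody else can reach.
    public static FakeStore CreateTransient()
    {
        return new FakeStore();
    }

    // Persistent: calls with the same identifier return the same store.
    public static FakeStore CreatePersistent(string id)
    {
        return Persistent.GetOrAdd(id, _ => new FakeStore());
    }
}
```

Adding a name through `FakeStoreFactory.CreatePersistent("1")` makes it visible to any later `CreatePersistent("1")` call, while `CreatePersistent("2")` and every `CreateTransient()` call start from an empty store, which matches the counts printed by the three tests.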

Use them wisely!

Using Effort in complex applications

In a previous post some of the basic features of the Effort library were introduced with very simple code samples. You might have asked: okay, but how should I use this tool in a complex application? This post tries to answer that question.

Applications have to be designed properly in order to make them easily testable. A traditional technique called decoupling is widely used to achieve testability. It means that your system has to be built up from individual pieces (usually referred to as components) that can be easily disassembled and reassembled. A previous post demonstrates a technique that makes it possible to decouple Entity Framework based data driven applications. Do not continue without reading and understanding it! The solution presented here relies on the architecture described there.

In a complex application, several instances of the same ObjectContext type might be used while serving a single request, and these instances usually have to work on the same database. The ObjectContextFactory class of the Effort library can only create individual ObjectContext instances with no relation between them, so each one works with a completely unique fake database. If you want to create ObjectContext instances that work on the exact same database, create a fake EntityConnection object and pass it to all of them. The Effort library provides the EntityConnectionFactory class for exactly this purpose. Its factory methods accept an entity connection string as an argument and return an EntityConnection object that communicates with an in-process in-memory fake database.

But how should this factory class be used in the architecture presented in the mentioned blog post? There are two kinds of possible injection points in that system: slots with the IObjectContextFactory or the IConnectionProvider interface. The latter is ideal for this scenario, because a single connection provider component is shared among the different data components, so they can use the exact same connection object. Before creating the fake component, let’s take a look at the implementation of the original production component.

public class DefaultConnectionProvider : ConnectionProviderBase
{
    protected override EntityConnection CreateConnection()
    {
        return new EntityConnection("name=ClubEntities");
    }
}

The ConnectionProviderBase class makes the implementation really self-evident. This base class can be used to create the new fake connection provider component too. Simply use the mentioned factory class of the Effort library to instantiate the fake EntityConnection instance.

public class FakeConnectionProvider : ConnectionProviderBase
{
    protected override EntityConnection CreateConnection()
    {
        return Effort.EntityConnectionFactory.CreateTransient("name=ClubEntities");
    }
}

That’s it! Now just use the FakeConnectionProvider class instead of the DefaultConnectionProvider class (in the same way) and the data operations initiated by the data components will be executed on a single in-memory fake database. In this way automated tests can be created without depending on the external database.
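The ConnectionProviderBase class itself is defined in the referenced post and is not shown here. As a rough illustration of why one provider leads to one shared connection, the following sketch assumes a base class that calls its factory method lazily and only once. `SharedConnectionProvider` and `FakeProvider` are hypothetical names, and a plain object stands in for the EntityConnection.

```csharp
using System;

// Hypothetical stand-in for a connection provider base class.
public abstract class SharedConnectionProvider<TConnection> where TConnection : class
{
    private readonly Lazy<TConnection> connection;

    protected SharedConnectionProvider()
    {
        // The delegate is only stored here; CreateConnection runs on first access.
        connection = new Lazy<TConnection>(CreateConnection);
    }

    // Every consumer of this provider receives the same connection object.
    public TConnection Connection
    {
        get { return connection.Value; }
    }

    // Derived classes decide which connection (real or fake) gets created.
    protected abstract TConnection CreateConnection();
}

// A fake provider; a plain object stands in for the fake connection.
public sealed class FakeProvider : SharedConnectionProvider<object>
{
    protected override object CreateConnection()
    {
        return new object();
    }
}
```

Under these assumptions, every data component that receives the same provider instance reads the same `Connection`, so all of them end up talking to one fake database, while a second provider instance gets a connection of its own.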

As shown, using the Effort library in complex data driven applications can require careful architectural design. However, in a properly built system it can be integrated without too much effort.

Introducing Effort

So what is Effort? It stands for Entity Framework Fake ObjectContext Realization Tool, and that is exactly what it is meant to do. Creating automated tests for data driven applications has never been a trivial task. This is also true for Entity Framework based applications: implementing a proper fake ObjectContext or DbContext class requires great effort. Oh, sure… 🙂

This library approaches the problem from a very different angle: it tries to emulate the underlying resource-heavy external database with a lightweight in-process in-memory database. This makes it possible to run your tests rapidly without the presence of an external database. By the end of this blog post, you will see exactly how.

Effort can be downloaded from CodePlex or installed with NuGet. It is really convenient to use: in practice you don’t have to modify your existing ObjectContext or DbContext classes at all. The following example shows this. Let’s assume that we have an ObjectContext class called NorthwindEntities.

using(NorthwindEntities context = new NorthwindEntities())
{
    return context.Categories.ToList();
}

This code returns all the categories stored in the database. A simple modification is enough to make Entity Framework use a fake in-memory database instead:

using(NorthwindEntities context = 
    Effort.ObjectContextFactory.CreateTransient<NorthwindEntities>())
{
    return context.Categories.ToList();
}

The term “transient” refers to the lifecycle of the underlying in-memory database. The owner ObjectContext (technically the DbConnection) will use a completely unique database instance. If the context/connection is disposed, then the database will be disposed too. If you run this code, an empty collection will be returned. This is self-evident, since the fake database was completely empty.

You could set the initial state of the database with Entity Framework, but Effort provides data loaders to do this more easily. The following example fetches the initial data from a real database:

IDataLoader loader = new EntityDataLoader("name=NorthwindEntities");

using(NorthwindEntities context = 
    ObjectContextFactory.CreateTransient<NorthwindEntities>(loader))
{
    return context.Categories.ToList();
}

This snippet and the first one return exactly the same collection of entities, but there is a very big difference: you can do anything you want with this data context, and the result will be the same every time. The following example proves this:

IDataLoader loader = new EntityDataLoader("name=NorthwindEntities");

using(NorthwindEntities context = 
    ObjectContextFactory.CreateTransient<NorthwindEntities>(loader))
{
    foreach (Category cat in context.Categories)
    {
        context.Categories.DeleteObject(cat);
    }
    context.SaveChanges();
}

using(NorthwindEntities context = 
    ObjectContextFactory.CreateTransient<NorthwindEntities>(loader))
{
    return context.Categories.ToList();
}

The first part of this code deletes all the categories from the database. Even so, the second part returns exactly the same collection as before. If you run the code again, the first part has to delete the entities again; the object set will never be empty.

Furthermore, you can completely eliminate the need for the external database. Export your data tables into local CSV files (Effort provides a tool to do it easily) and use the CSV data loader.

IDataLoader loader = new CsvDataLoader(@"C:\PathOfTheCsvFiles");

using(NorthwindEntities context = 
    ObjectContextFactory.CreateTransient<NorthwindEntities>(loader))
{
    return context.Categories.ToList();
}

If you run this code, there will be zero communication with the external database, yet it behaves exactly as if there had been.

As you can see, Effort makes it possible to create automated tests for Entity Framework applications in a very convenient and powerful way. The tests can run without the presence of any external database engine. Each test can work on a completely unique database instance, so their actions are completely isolated and they can even run concurrently. The state of the database they work on can be easily initialized too.

Future posts will reveal the capabilities of Effort for different scenarios.