The CSV file format in Effort

This post describes the file format that is compatible with the CsvDataLoader component of the Effort library.

The component accepts files that follow the traditional CSV format:

  • The first row contains the header names
  • Comma ( , ) is the separator character
  • Double quote ( " ) is the delimiter character
  • Double double quote ( "" ) is used to express a single double quote between delimiters

There are some additional requirements that need to be taken into consideration.

  • Numbers and dates are parsed using the invariant culture
  • Binaries are encoded in base64 format
  • Null values are represented by empty fields without delimiters
  • Empty strings are represented by empty fields with delimiters
  • Backslash serves as the escape character for backslash and newline characters

These are all the rules that need to be followed. The next example demonstrates them with a compatible CSV file.

id,name,birthdate,reportto,storages,photo
"JD","John Doe",01/23/1982,"MHS","\\\\server1\\share8\r\n\\\\server2\\share3",
"MHS","Michael ""h4x0r"" Smith",05/12/1975,,"","ZzVlKyszZjQ5M2YzNA=="

The content of each database table is represented by a dedicated CSV file that has to be named {table name}.csv.
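
The formatting rules above can be sketched as a small field encoder. This is only an illustration: CsvFieldEncoder is a hypothetical helper, not part of Effort, and writing newlines as \r and \n is an assumption inferred from the example file rather than from a formal specification.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of Effort): formats a single value as one
// CSV field according to the rules described in this post.
public static class CsvFieldEncoder
{
    public static string Encode(object value)
    {
        switch (value)
        {
            case null:
                // Null: empty field without delimiters.
                return string.Empty;
            case string text:
                // Strings are always delimited, so "" stays distinct from null.
                return Quote(text);
            case byte[] bytes:
                // Binary data: base64 text, delimited as in the example file.
                return Quote(Convert.ToBase64String(bytes));
            case IFormattable formattable:
                // Numbers and dates: default format of the invariant culture
                // (a DateTime therefore includes its time part).
                return formattable.ToString(null, CultureInfo.InvariantCulture);
            default:
                return Quote(value.ToString());
        }
    }

    private static string Quote(string text)
    {
        string escaped = text
            .Replace("\\", "\\\\")  // escape backslashes first
            .Replace("\r", "\\r")   // assumed spelling of an escaped CR
            .Replace("\n", "\\n")   // assumed spelling of an escaped LF
            .Replace("\"", "\"\""); // a double quote becomes two double quotes
        return "\"" + escaped + "\"";
    }
}
```

With this sketch, Encode(null) returns an empty string, Encode("") returns "" (two double quotes), and Encode("Michael \"h4x0r\" Smith") returns "Michael ""h4x0r"" Smith", which matches the second record of the example file.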

Data loaders in Effort

Data loaders are useful components in the Effort library that were designed to help set up the initial state of a fake database.

Adding records to the tables through the Entity Framework API can be inflexible, and the resulting code might become hard to maintain. Furthermore, these types of insert operations flow through the entire EF and Effort pipeline, which might have a significant performance impact. Data loaders solve these problems by allowing data to be inserted from any custom source during initialization with very little overhead.

They are really easy to use: all the developer has to do is create a data loader instance and pass it to the chosen Effort factory method. For example:

var dataLoader = new EntityDataLoader("name=MyEntities");

var connection = DbConnectionFactory.CreateTransient(dataLoader);

Effort provides multiple built-in data loaders:

  • EntityDataLoader
  • CsvDataLoader
  • CachingDataLoader

EntityDataLoader is able to fetch data from an existing database by utilizing an existing Entity Framework compatible ADO.NET provider. It is initialized with an entity connection string.

var dataLoader = new EntityDataLoader("name=MyEntities");

The purpose of CsvDataLoader is to read data records from CSV files. It is initialized with a path that points to a folder containing the CSV files. Each file represents the content of a database table.

var dataLoader = new CsvDataLoader(@"C:\path\to\files");

The exact format of these CSV files is documented in a separate post. There is also a small tool that helps developers export the data from an existing database into appropriately formatted CSV files.

The CachingDataLoader was designed to speed up the initialization process by wrapping any kind of data loader with a cache layer. The first time a wrapped data loader is used with a specific configuration, the CachingDataLoader pulls the required data from the wrapped data loader and, as a side effect, caches it in memory. If a CachingDataLoader is later initialized to wrap the same kind of data loader with the same configuration, the data is retrieved from the previously created cache and the wrapped data loader is not used.

var wrappedDataLoader = new CsvDataLoader(@"C:\path\to\files");

var dataLoader = new CachingDataLoader(wrappedDataLoader, false);
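
The caching behavior described above can be illustrated with a small conceptual sketch. It does not show Effort's actual implementation: IRowSource, CachingRowSource and the CacheKey property are hypothetical names, and the cache key format is an assumption made only for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical abstraction of "a data loader with a configuration".
public interface IRowSource
{
    // Identifies the loader kind plus its configuration,
    // for example "csv|C:\path\to\files".
    string CacheKey { get; }

    IReadOnlyList<string[]> LoadRows(string tableName);
}

// Hypothetical wrapper that adds a process-wide cache layer.
public sealed class CachingRowSource : IRowSource
{
    // Static, so a later wrapper around an equally configured source
    // finds the data that an earlier wrapper already loaded.
    private static readonly ConcurrentDictionary<string, IReadOnlyList<string[]>> Cache =
        new ConcurrentDictionary<string, IReadOnlyList<string[]>>();

    private readonly IRowSource wrapped;

    public CachingRowSource(IRowSource wrapped)
    {
        this.wrapped = wrapped ?? throw new ArgumentNullException(nameof(wrapped));
    }

    public string CacheKey => wrapped.CacheKey;

    public IReadOnlyList<string[]> LoadRows(string tableName)
    {
        // The wrapped source is consulted only on a cache miss. Under
        // contention the factory may run more than once, but only one
        // result is stored.
        return Cache.GetOrAdd(
            wrapped.CacheKey + "|" + tableName,
            _ => wrapped.LoadRows(tableName));
    }
}
```

In this sketch the key combines the kind of source, its configuration and the table name, so two wrappers around equally configured sources share their cached rows, while a differently configured source triggers a fresh load.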

Each data loader suits a different scenario. I suggest using EntityDataLoader during interactive testing, while CachingDataLoader combined with CsvDataLoader can be really useful in automated tests.