Wednesday, April 02, 2014

Validating entities using Fluent Validation Library.

Problem: I have a set of entities which are used in a service, and I need to validate them against the following conditions:
1. Property is mandatory.
2. Property is a string of maximum length 30.
3. Property has a value of either "PSHP" or "STOR".
4. If the property in point 3 has a value, then the related properties must also have values.

Solution: The idea is to use the FluentValidation NuGet package
PM>  Install-Package FluentValidation
So our entity is as below:

public class Parcel
    {
        public string OrderNumber { get; set; }
        public string Description { get; set; }
        public string NodeType { get; set; }
        public int NodeId { get; set; }
        public string NodeName { get; set; }
        public Address Address { get; set; }
    }

OrderNumber is mandatory.
Description is not mandatory but can be of maximum length 30.
NodeType can have a value of "PSHP" or "STOR".
If NodeType is supplied then NodeId and NodeName should also be supplied.
Create a class, let's call it ParcelValidator after our entity Parcel.
This class will inherit from the generic class AbstractValidator<Parcel>.
Now let's see how we can implement the mandatory behaviour.
Rule 1
public ParcelValidator()
{
    RuleFor(parcel => parcel.OrderNumber).Length(1, 15).NotEmpty().WithMessage("Order Number is required");
}


Simples!!!

You can even pull the message from a resource file if you want, as sketched below.
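For example, a minimal sketch assuming a ValidationMessages class generated from a .resx file (the resource class name and key are made up for illustration):

// ValidationMessages is a hypothetical .resx-generated class exposing string properties.
RuleFor(parcel => parcel.OrderNumber)
    .NotEmpty()
    .WithMessage(ValidationMessages.OrderNumberRequired);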

Rule no 2:


RuleFor(parcel => parcel.Description).Length(0, 30).WithMessage("Description can be only 30 characters");


Specifying the length range as 0 to 30 allows the description to be empty while limiting it to a maximum of 30 characters.

Rule no 3 & 4:


When(parcel => !string.IsNullOrEmpty(parcel.NodeType), () =>
{
    RuleFor(parcel => parcel.NodeType).Length(0, 4).WithMessage("Invalid Node type");
    RuleFor(parcel => parcel.NodeType).Must(BeAValidNodetype).WithMessage("Invalid Node type");
    RuleFor(parcel => parcel.NodeId).NotEmpty().WithMessage("Nodeid is required");
    RuleFor(parcel => parcel.NodeName).NotEmpty().WithMessage("Node name is required");
});

Since we want to check the properties in rules 3 & 4 only if NodeType is provided, we need to use the When/Unless construct.
And since we want to check that the node type is "PSHP" or "STOR", we will have a private method that must be satisfied. This check is implemented by calling Must() and passing the name of the method.
Let's look at the code all together.
public class ParcelValidator : AbstractValidator<Parcel>
    {
 
        public ParcelValidator()
        {
            RuleFor(parcel => parcel.OrderNumber).Length(1, 15).NotEmpty().WithMessage("Order Number is required");
            RuleFor(parcel => parcel.Description).Length(0, 30).WithMessage("Description can be only 30 characters");
            RuleFor(parcel => parcel.Address.HouseNo).NotEmpty().WithMessage("House Number is required");
            When(parcel => !string.IsNullOrEmpty(parcel.NodeType), () =>
            {
                RuleFor(parcel => parcel.NodeType).Length(0, 4).WithMessage("Invalid Node type");
                RuleFor(parcel => parcel.NodeType).Must(BeAValidNodetype).WithMessage("Invalid Node type");
                RuleFor(parcel => parcel.NodeId).NotEmpty().WithMessage("Nodeid is required");
                RuleFor(parcel => parcel.NodeName).NotEmpty().WithMessage("Node name is required");
            });
 
 
        }
 
        private bool BeAValidNodetype(string nodeType)
        {
            if (string.IsNullOrEmpty(nodeType))
                return true;
            if ((nodeType == "PSHP") || (nodeType == "STOR"))
            {
                return true;
            }
            return false;
        }
    }
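The post does not show the validator being invoked, so here is a minimal usage sketch: create the validator, validate a Parcel instance and inspect any failures.

var validator = new ParcelValidator();
var result = validator.Validate(parcel);

if (!result.IsValid)
{
    // Each failure carries the message supplied via WithMessage().
    foreach (var failure in result.Errors)
    {
        Console.WriteLine(failure.ErrorMessage);
    }
}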


Conclusion: The FluentValidation library makes it very easy to implement these rules, especially rule number 4. Also, what is not apparent from this set of rules is that I am using the same entity for different clients, and they have different sets of rules in terms of length and whether or not a property is mandatory.
The complete project is available for download on GitHub.

Sunday, March 30, 2014

Validation using Data Annotations in the .NET Framework vs. Fluent Validation


I have heard many people voice strong opinions that their way of doing things, or their way of thinking about things, is best. Wherever you go you will find these opinionated species; the most common thing you hear is "Microsoft sucks". Anyway, the point I am trying to get to is that everything has its own pluses and minuses. It's out there for you to judge, so use the best tool for the job.
Most recently I had a chance to work on a project where I had to make this very decision: which validation library should I use? Note: apart from these two there are others too, like the Enterprise Library Validation Application Block. The reason I made a choice between these two is that I wanted something lightweight and the project was medium sized. Had the project been on the large side I would have gone for Enterprise Library, as it provides the full kitchen sink.
Validation using Data Annotations (System.ComponentModel.DataAnnotations) in the .NET Framework
Why I would choose it
  1. Easy to implement; validation rules and messages can be set by decorating the properties with attributes (see the sketch after this list)
  2. Baked into the framework, so there is no need to reference external NuGet packages
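For illustration, the Parcel entity from the previous post decorated with Data Annotations might look roughly like this (the attribute choices are illustrative, not taken from an actual project):

using System.ComponentModel.DataAnnotations;

public class Parcel
{
    // Mandatory property with a custom message.
    [Required(ErrorMessage = "Order Number is required")]
    public string OrderNumber { get; set; }

    // Optional property limited to 30 characters.
    [StringLength(30, ErrorMessage = "Description can be only 30 characters")]
    public string Description { get; set; }
}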
Fluent Validation library
Why I would choose it


  1. Can implement different validations for the same entity depending on the client. In a multi-tenant system one client has a Name of length 30 and another has 40 (see the sketch after this list)
  2. Can easily implement custom validations. This can be done via Data Annotations as well, but it is easier with FluentValidation, as it is just a matter of writing a lambda expression and a function and being done with it, whereas with Data Annotations I need to go through the ceremony of implementing custom attributes.
  3. Can set localised messages easily
  4. This is a bit extreme, however: I can separate the validation into an assembly separate from the entities, and hence when I want to update just the validation I only have to deploy the validation assembly.
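One way to get the per-client behaviour mentioned in point 1 is to parameterise the validator. A sketch, using a hypothetical Customer entity; the maximum length would come from the tenant's configuration:

using FluentValidation;

public class Customer
{
    public string Name { get; set; }
}

public class CustomerValidator : AbstractValidator<Customer>
{
    // maxNameLength is supplied per tenant, e.g. 30 for one client and 40 for another.
    public CustomerValidator(int maxNameLength)
    {
        RuleFor(customer => customer.Name)
            .NotEmpty()
            .Length(1, maxNameLength);
    }
}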


      

Saturday, March 08, 2014

Reading data from a flat file

Problem:

Read the data from a flat/delimited file which has a huge number of columns

Solution:

Currently I am working on a few integration projects, and interesting ones too, especially the last one, as I had to get the tools out of my arsenal. The project involved importing around 600,000 rows into the database. So far so good, you would say; that's easy, use a StreamReader to read the data and be done with it. As they say, the devil is in the detail: as it happened, each row contained data for 107 columns in the database.
As you can see, with 600,000 rows containing data for 107 columns, you can imagine the number of variables you would have to write to read the data and put it into a class, which can then be inserted into the database. The approach I wanted involved less manual typing, because the more typing I do with as many as 107 variables, the more chances of me making a mistake.
What would the sample code that reads the data from the file look like? Assuming that we are reading data into a DataTable:
dataRow["DataColumnName"] = line.Substring(58, 2).Trim();

So without any further ado, let me tell you what tools you will need and how to do it. You will need Visual Studio, of course, no getting away from that, and the humble Microsoft Excel. I used Excel; you could use any tool which provides similar functionality. Yes, you read that right: Excel.
Step 1:
Define your destination tables with exactly the same data types and lengths.
Step 2:
Once the table is defined, get the definition back and put the exact same column names in Excel column 1 and the field lengths in column 2, which would look something like this:
[Image: Excel sheet with the column names in column 1 and the field lengths in column 2]
Step 3:
Next we need to declare all the variables. The idea here is to keep the column names in the DataTable the same as the columns in the database, because we are going to use SqlBulkCopy to insert the data, as described in this post. Also, if we need to use the column names somewhere else in the code we can do so easily once we have declared them as constants. Again, we will need to declare the field length and field start as constants. Also add the data type of the column. This will finally look like:
[Image: Excel sheet with columns for column name, field length, field start and data type]
Step 4:
Now with both the above steps done we are ready to produce our code. The code we want to produce looks like below:
[Image: the generated constant declarations]
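Since the screenshot is not reproduced here, a rough sketch of the kind of declarations the spreadsheet generates (the column names and lengths are made-up examples; the real file had 107 columns):

// One pair of constants per database column: the column name and the field length.
private const string OrderNumber = "OrderNumber";
private const int OrderNumberLength = 15;

private const string Description = "Description";
private const int DescriptionLength = 30;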
And how did this happen? With a simple formula:
[Image: the Excel formula used to generate the declarations]
Now, I have used a separate Excel column for the declaration keywords, i.e. private const int and private const string, a ';' and a ','. You could very well put these in the formula; I am just being lazy, that's all. The advantage it gives me is that if I wanted to declare one variable as public and another as protected I could still do it, not that I needed to in this case. Also, it lets me keep my formula clean.
Now we just combine the different columns to get the following results:
[Image: the combined columns producing the final constant declarations]
Create data table code
[Image: the generated code that creates the DataTable and its columns]
Code to read different parts of the line into the data table.
[Image: the generated code that reads each field of the line into the data table]
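Again the screenshot is not reproduced here, so a sketch of what the generated read code looks like, combining the constants above with the Substring pattern shown earlier (names and positions are illustrative):

// Each generated line copies one fixed-width field from the current line into the row;
// the start positions are plain numbers, as discussed below.
dataRow[OrderNumber] = line.Substring(0, OrderNumberLength).Trim();
dataRow[Description] = line.Substring(15, DescriptionLength).Trim();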
Now, the point to note is that I have used the start position as a plain number; it should really have been a constant declaration too. Why? Simply because if you make an error in calculating the start of one of the columns, you would have to regenerate the whole read-line block, whereas if it were a constant all you would do is change the constant declarations. It's a matter of opinion; I just like to keep it clean, that's all.

Conclusion:

The whole process of variable declaration and reading the individual columns is automated, so I do minimal coding and typing, thus reducing the human error. Also, it is repeatable: I can reuse it any time there are more columns which I need to add or amend.

Sunday, March 02, 2014

How to read command line arguments in .net

 

More often than not, when we write a console application in .NET we are required to parse the command line arguments. And most of the time just parsing the arguments is not the only thing you do; you also need to validate them. It would be nice to have a library which we can reuse every time we build console applications in .NET.

Args is just that sort of library.  

All we need to do is create a class with typed properties; the arguments which you need to pass in become the properties. The complete documentation is available on GitHub.
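A minimal sketch of the idea; the ImportOptions class and its properties are made up for illustration, and the Configure/CreateAndBind call is written from memory of the Args documentation, so check the GitHub README for the exact API:

// A plain class whose typed properties become the command line arguments.
public class ImportOptions
{
    public string InputFile { get; set; }
    public int BatchSize { get; set; }
    public bool Verbose { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        // Bind and validate the raw arguments onto the typed options class.
        var options = Args.Configuration.Configure<ImportOptions>().CreateAndBind(args);
    }
}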

Wednesday, February 26, 2014

Importing Data from a flat file to SQL Server using SQLBulkCopy


Problem: Import a file with over 600,000 records which is around 250MB in size. The import needs to be quick.
Solution: Recently I was tasked with a carrier integration project. This carrier configuration file contained data for 160 different columns. Traditional file reading and inserts using Entity Framework or SQL INSERT statements were not going to work.
The first idea which came to my mind was creating an SSIS package. This would have been a perfect solution, as SSIS is designed just for this. However, I had a restriction in place whereby I had to use SQL Express only.
This meant I needed something I could do bulk inserts with. In comes SqlBulkCopy: since I was going to import into SQL Server, SqlBulkCopy was perfect for this case. Believe it or not, this class has existed in the framework since 2.0. It has remained a less well known class for a good reason: we don't write import routines every day compared to other, more routine operations.
The idea at a high level was to read the data from the file, insert it into a DataTable and then use SqlBulkCopy to insert it into the database. Writing the code to read the file was another interesting part; there is another post coming on that too. Apart from using a DataTable, we could use an IDataReader as well.
I chose DataTable as I could use the DataTable name and column names later to do the mapping and supply the table name while inserting the data.
Namespace System.Data.SqlClient
Step 1: Create DataTable
// SequenceNumber, CarrierId, ModId and Barcodes are presumably string constants
// (declared elsewhere) holding the destination column names.
var networkTable = new DataTable("dbo.Networks");
networkTable.Columns.Add(SequenceNumber, typeof(string));
networkTable.Columns.Add(CarrierId, typeof(string));
networkTable.Columns.Add(ModId, typeof(string));
networkTable.Columns.Add(Barcodes, typeof(double));

Step 2: Populate DataTable
// newRow is created from networkTable.NewRow() for each line read from the file
// and added to the table once its fields are set.
newRow[SequenceNumber] = line.Substring(5, 3).Trim();
newRow[CarrierId] = line.Substring(8, 2).Trim();
newRow[ModId] = line.Substring(10, 3).Trim();
newRow[Barcodes] = Convert.ToInt32(line.Substring(31, 1).Trim());

The code sample is kept brief on purpose.
Step 3: Insert Records
private void insertRecords(DataTable dataTable)
{
    // SQlConnection holds the connection string; the (string, SqlBulkCopyOptions) overload is used here.
    using (var bulkCopy = new SqlBulkCopy(SQlConnection, SqlBulkCopyOptions.KeepIdentity | SqlBulkCopyOptions.TableLock))
    {
        bulkCopy.DestinationTableName = dataTable.TableName;
        bulkCopy.BatchSize = 5000;
        try
        {
            // Map each source column to the destination column of the same name.
            foreach (var column in dataTable.Columns)
            {
                bulkCopy.ColumnMappings.Add(column.ToString(), column.ToString());
            }
            bulkCopy.WriteToServer(dataTable);
        }
        catch (Exception)
        {
            // Rethrow without resetting the stack trace ("throw ex" would lose it).
            throw;
        }
    }
}

As we can see from the function above, using a DataTable allows me to generalise the function so I can reuse it from anywhere in the project.
The SqlBulkCopy options SqlBulkCopyOptions.KeepIdentity and SqlBulkCopyOptions.TableLock are in place so that we preserve the identity values, and the TableLock option allows us to lock the table during the inserts.
The batch size of 5000 indicates that a batch of 5000 records will be sent to the server in a single go.
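Putting the three steps together, the surrounding read loop looks roughly like this (a sketch: the file path and the decision to send everything in one WriteToServer call are placeholders rather than the project's actual code):

using (var reader = new StreamReader(@"C:\import\networks.txt"))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        // Step 2 for each line: cut the fixed-width fields into a new row.
        var newRow = networkTable.NewRow();
        newRow[SequenceNumber] = line.Substring(5, 3).Trim();
        newRow[CarrierId] = line.Substring(8, 2).Trim();
        newRow[ModId] = line.Substring(10, 3).Trim();
        newRow[Barcodes] = Convert.ToInt32(line.Substring(31, 1).Trim());
        networkTable.Rows.Add(newRow);
    }
}

// Step 3: bulk copy the populated table into SQL Server.
insertRecords(networkTable);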
During the operation I had a couple of error messages come up, as below:
Error 1: Failed to convert parameter value from a String to a Int32
After encountering this error I set the SqlBulkCopyOptions.KeepIdentity option so that the identity columns are ignored during the inserts.

Error 2: The given value of type String from the data source cannot be converted to type nvarchar of the specified target column.
Error 3: The given value of type String from the data source cannot be converted to type int of the specified target column.
After encountering errors 2 & 3 I set up the column mapping code so that the right source columns are inserted into the right destination columns. Initially I had assumed that keeping the source and destination column names the same would be enough.

Error 4: The given ColumnMapping does not match up with any column in the source or destination.
Error 4 was because of a mistake on my part, but the thing to remember is that the column mappings are case sensitive. I had the column names the same, with the same casing; however, the mistake I had made was trailing spaces in the column names of the DataTable.
Conclusion: The import process took 55 seconds for a sample of 550,000 lines (reading + inserting into the db).