Thursday, 7 May 2009

How to Compare two DataTables in ADO.net

Sometimes it’s very useful to compare two DataTables and return the different records to a third DataTable. I came across this when I needed to handle my own data concurrency control. Basically my logic is:

- Populate a DataTable of a DataSet and Copy the content of the DataTable to another DataTable immediately after this. (For example we populate a DataTable A and then make a copy to DataTable B.)
- When user update the DataTable A in my application, the event handler firstly populate to a DataTable C from the database.
- Comparing the DataTable C with the DataTable B, if they are same means underlying data hasn’t been changed, then proceed the update otherwise user can choose either to overwrite the data or cancel the operation.

Unfortunately you can’t do a straight comparison of two DataTables. For instance if you write something like: DataTable 1 == DataTable2 or DataTable1.GetHashCode() == DataTable2..GetHashCode() or DT1.Equals(DT2), none of this would work. Below is the code I am using to compare to DataTables, pretty self-explanatory. Hopefully you find it helpful.

private void button1_Click(object sender, EventArgs e)
{
// Something to do with the Initialization of the FirstDataTable and SecondDataTable

DataTable dt;
dt = getDifferentRecords(FirstDataTable, SecondDataTable);

if (dt.Rows.Count == 0)
MessageBox.Show("Equal");
else
MessageBox.Show("Not Equal");
}

#region Compare two DataTables and return a DataTable with DifferentRecords
public DataTable getDifferentRecords(DataTable FirstDataTable, DataTable SecondDataTable)
{
//Create Empty Table
DataTable ResultDataTable = new DataTable("ResultDataTable");

//use a Dataset to make use of a DataRelation object
using (DataSet ds = new DataSet())
{
//Add tables
ds.Tables.AddRange(new DataTable[] { FirstDataTable.Copy(), SecondDataTable.Copy() });

//Get Columns for DataRelation
DataColumn[] firstColumns = new DataColumn[ds.Tables[0].Columns.Count];
for (int i = 0; i < firstColumns.Length; i++)
{
firstColumns[i] = ds.Tables[0].Columns[i];
}

DataColumn[] secondColumns = new DataColumn[ds.Tables[1].Columns.Count];
for (int i = 0; i < secondColumns.Length; i++)
{
secondColumns[i] = ds.Tables[1].Columns[i];
}

//Create DataRelation
DataRelation r1 = new DataRelation(string.Empty, firstColumns, secondColumns, false);
ds.Relations.Add(r1);

DataRelation r2 = new DataRelation(string.Empty, secondColumns, firstColumns, false);
ds.Relations.Add(r2);

//Create columns for return table
for (int i = 0; i < FirstDataTable.Columns.Count; i++)
{
ResultDataTable.Columns.Add(FirstDataTable.Columns[i].ColumnName, FirstDataTable.Columns[i].DataType);
}

//If FirstDataTable Row not in SecondDataTable, Add to ResultDataTable.
ResultDataTable.BeginLoadData();
foreach (DataRow parentrow in ds.Tables[0].Rows)
{
DataRow[] childrows = parentrow.GetChildRows(r1);
if (childrows == null || childrows.Length == 0)
ResultDataTable.LoadDataRow(parentrow.ItemArray, true);
}

//If SecondDataTable Row not in FirstDataTable, Add to ResultDataTable.
foreach (DataRow parentrow in ds.Tables[1].Rows)
{
DataRow[] childrows = parentrow.GetChildRows(r2);
if (childrows == null || childrows.Length == 0)
ResultDataTable.LoadDataRow(parentrow.ItemArray, true);
}
ResultDataTable.EndLoadData();
}

return ResultDataTable;
}
#endregion

5 comments:

  1. Good post, many thanks.

    The only issue we have found with this technique is that you get duplicate records in the resulting DataTable and no real indication of what the differences are. e.g. Inserts, Updates or Deletes. We have also attempted using the Merge function but this also errors with Constraint issues. We are trying to find a way to get the differences only and an idea of what has changed.

    ReplyDelete
  2. Good Code but ResultDataTable is returning the empty rows as different how to solve this.

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete
  4. It worked perfect! Thank you!

    ReplyDelete
  5. This solution is flawed.
    This does not account for when the two datatables return a different amount of rows.
    You should implement a if check to see if FirstDataTable and SecondDataTable have the same number of rows first, before calling getDifferentRecords().

    ReplyDelete