Re: comparison tool
  Home FAQ Contact Sign in
microsoft.public.sqlserver.tools only
 
Advanced search
POPULAR GROUPS

more...

 Up
Re: comparison tool         

Group: microsoft.public.sqlserver.tools · Group Profile
Author: Dariusz Dziewialtowski
Date: Sep 17, 2008 18:53

> Can you explain "row-by-rowcomparison"?  Do you mean you inspect the data
> in both tables and match rows manually?  Or do you pick columns to "pretend"
> they are the key?  Otherwise, without a key (whether PK or UQ), how could
> the software even attempt to guess at matching rows on its own?

First of all - apologies for the long post; feel free to skip it - as
a short answer let me say that when compared tables have no key or
indexes (or: one of the tables has keys, and another - not), the user
can additionally specify "sorting" columns for the purpose of data
comparison. All options in SCT (SQL Server Comparison Tool) were
added to handle particular real life situations, and over the years
SCT "accumulated" such options like: "Columns to be ignored during
comparison" or "Comparison offset".

I'll try to explain this I short as possible, but at the beginning let
me say that over the years I've seen lot of databases and what's
always astonishing for me is how un-structured the data in many cases
is. There is surprising (for me at least) amount of tables with no
primary keys and no indexes whatsoever. I've seen tables with close
to hundred columns, no PK nor indexes, and *tons* of duplicate rows.
Everything has a reason, so I assume that business requirements
dictate such design.

It is not uncommon that people just dump the data out of the tables
into text files, remove duplicates using some kind of home-brew
programs, and compare data 'by hand'.

Over the years, I had to adjust SCT to handle user's data; lot of data
required extra fine-tuning before actual comparison.

While in "row-by-row comparison" mode, program "pulls" the data out of
compared tables in their "original" (or "neutral") sequence, using
equivalent of "SELECT * " statement. The user has additional option
of sorting data before comparing, where the user can specify sorting
columns (different sorting can be set up on source table and different
- on target table). This covers situations where for example the
source table has some indexes set, but target - none. This also
allows for comparing tables with different names of the columns.
Users have also options to ignore some columns, which allows for
comparing data in tables with different number of columns.

I had even users comparing data in numeric columns to data in text
columns.

All in all, it seems that current set of options for data comparison
covers pretty much all of "real life" data comparisons performed by
current users of SCT - for long time I didn't have any requests for
additional options.

Having said that, I have to admit that I never checked how data
comparison is handled by other programs. It is quite possible that
all the options could be implemented in simpler, more efficient way.

Uff - I promise in future not to post so long responses.
no comments
diggit! del.icio.us! reddit!