Hi all. We need to test some changes that we will be rolling out. They are performance changes, so the output data should not change. To verify this, the plan is to create a "before" snapshot table by selecting all records from the graphical view, roll out the changes, and then verify that the records returned by the same graphical view have not changed. Note: the data in the base tables is static.
As a POC, I created a snapshot table using
CREATE TABLE schema.snapshottable AS (SELECT * FROM "PUBLIC"."GRAPHICAL_VW") WITH DATA;
and then compared it back to the graphical view using different techniques.
Using Minus/Except
select * from GRAPHICAL_VW
minus -- also tried except
select * from schema.snapshottable
Expected: no rows returned
Received: numerous rows
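For reference, the query above only checks one direction (rows in the view that are missing from the snapshot). A two-way version might look like the sketch below. The label column `diff_side` and the aliases `v` and `s` are names I made up for illustration. Note that MINUS/EXCEPT compare distinct rows, so this would not catch a row that merely appears a different number of times on each side.

```sql
-- Sketch only: symmetric difference between the view and the snapshot.
-- diff_side is a made-up label column showing which side a row came from.
select 'view_only' as diff_side, v.*
from (
  select * from GRAPHICAL_VW
  minus
  select * from schema.snapshottable
) v
union all
select 'snapshot_only' as diff_side, s.*
from (
  select * from schema.snapshottable
  minus
  select * from GRAPHICAL_VW
) s;
```

If both sides were truly identical, this should return no rows at all.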
Using Union
select count(*) from
(
select * from GRAPHICAL_VW --300k records
union
select * from schema.snapshottable --300k records
)
Expected: Since the table was populated with every record from the view, I expected the union to remove the duplicates and return a count equal to the table row count (300k).
Received: ~408k. I manually checked some of the duplicate rows and they appeared to match exactly, so I don't understand why they weren't collapsed into one row.
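To put numbers on it: if each side holds 300k distinct rows, then |A ∪ B| = |A| + |B| − |A ∩ B| gives 408k = 600k − |A ∩ B|, so only about 192k rows match exactly and roughly 108k per side differ in at least one column. To find which column, a join on the key with column-by-column comparison could be used, as in the sketch below. `ID`, `AMOUNT` and `LOAD_TS` are hypothetical column names, not the real columns of the view. My guess (unconfirmed) is that something that looks identical on screen, such as a floating-point value, a timestamp, or trailing spaces, is not actually equal.

```sql
-- Sketch only: ID, AMOUNT and LOAD_TS are hypothetical column names;
-- substitute the view's real key and the columns to be checked.
select v.ID,
       v.AMOUNT  as view_amount,  s.AMOUNT  as snap_amount,
       v.LOAD_TS as view_load_ts, s.LOAD_TS as snap_load_ts
from GRAPHICAL_VW v
join schema.snapshottable s
  on s.ID = v.ID
where v.AMOUNT <> s.AMOUNT
   or (v.AMOUNT is null and s.AMOUNT is not null)
   or (v.AMOUNT is not null and s.AMOUNT is null)
   or v.LOAD_TS <> s.LOAD_TS
   or (v.LOAD_TS is null and s.LOAD_TS is not null)
   or (v.LOAD_TS is not null and s.LOAD_TS is null);
```

The explicit null checks are there because `<>` evaluates to unknown when either side is null, so a plain inequality would silently skip those rows.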
Other Attempts without success
Created table A and table B one after another with the same create statement from the same view. Similar results.
Created row tables instead of column tables.
Question
Can anyone explain what I am missing here?
Any suggestions on how to compare two data sets for equality, i.e. to confirm that the before and after snapshots are identical? Is there a table checksum type feature?
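The closest thing to a checksum I can think of with plain SQL is comparing per-column aggregates on both sides, roughly as sketched below. Again, `ID`, `AMOUNT` and `LOAD_TS` are hypothetical column names and `src` is a made-up label. This is only a coarse fingerprint: matching aggregates do not prove the rows are identical, and sums over floating-point columns may differ in the last digits depending on the order of summation.

```sql
-- Sketch only: a coarse per-column fingerprint, not a true checksum.
-- ID, AMOUNT and LOAD_TS are hypothetical column names.
select 'view' as src,
       count(*)           as row_cnt,
       count(distinct ID) as id_cnt,
       sum(AMOUNT)        as amount_sum,
       min(LOAD_TS)       as min_load_ts,
       max(LOAD_TS)       as max_load_ts
from GRAPHICAL_VW
union all
select 'snapshot' as src,
       count(*),
       count(distinct ID),
       sum(AMOUNT),
       min(LOAD_TS),
       max(LOAD_TS)
from schema.snapshottable;
```

The two output rows should be identical apart from the `src` label if the data sets match.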
Thanks in advance for the help.