Fast MySQL query to diff tables
New here? Learn about Bountify and follow @bountify to get notified of new bounties! x

I need a MySQL query that will compare two tables with identical schema, looking for differences. One of the columns will have unique values (call it "ref" - this is not the primary key) and it will be used to link records in the two tables.

I would like the result to have a column for "type" and a value of "new", "deleted" or "changed". There cannot be any overlap between these, so for example "changed" can only contain records that exist in both tables and where at least one column (except "ref" and the primary key) has a change in it, new can only contain records that are in table2 but not in table1, and deleted can only contain records that are in table1 but not in table2.

No duplicates, please inquire for any clarification.

awarded to alv-c
Tags
MySQL
sql

Crowdsource coding tasks.

3 Solutions


Hello,
this query should do the work but i cannot test it

select ref, "deleted" from table1 
    where ref not in (select ref from table2) 
union 
select ref, "new" from table2 
    where ref not in (select ref from table1)
union 
select ref, "changed" from table1 
    where table1.ref = table2.ref 
    and row(table1.col1, table1.col2...) != row(table2.col1, table2.col2, ...)
Winning solution

I made this query based on kerncy's solution.

http://pastebin.com/NfABRQrT

Left comments so you know where to edit to add more fields to make a comparation if a record has changed.
Any change you need, tell me! (maybe not even showing records that still the same?)

I like it, thanks!
Araunah 3 months ago

Note - you can simplify this further by taking further advantage of the left join (saved 13% on a quick test I did)
Also I would expect an index on ref column on both tables would help.

# you have table1 and table 2 with same schema
select ref, "new" as diff from table2`
    where ref not in (select ref from table1)
union
select t1.ref, (
  case
  when t2.ref is null and t1.ref is not null
      then 'deleted'
  when t1.val<>t2.val /* or t1.val2<>t2.val or t1.val3<>t2.val3 */
      then 'changed'
  else     'same'
  end
  ) as diff
from table1 t1
left join table2 t2 on t2.ref=t1.ref  ; 
View Timeline