GoldenGate Data Manipulation – When Inserts & Updates differ
17/04/2015
One very useful aspect of GoldenGate is that it allows you to manipulate data between the source and the destination.
One recent problem I encountered was needing to alter the data differently for inserts than for updates. The receiving system needed default data in columns which may or may not be supplied by the insert or update statements. This is slightly more complex than it first appears:
- If we have an INSERT and the column value IS NULL, or the column value IS NOT SUPPLIED by the insert statement, we should set the default for that column.
- If we have an UPDATE and the column value IS NULL, then we should set the default for that column.
We must NOT set the default if the UPDATE does not supply the column, otherwise we may incorrectly overwrite data in the target system. The COLMAP fragments below show the distinction, and the full parameter file follows.
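As a minimal sketch, assuming a hypothetical timestamp column C2 (the same column name is used in the full example further down), the two cases differ only in whether @COLTEST also tests for MISSING:

-- Inserts: default the column when it is NULL or not supplied at all
C2 = @IF (@COLTEST (C2, NULL, MISSING), '1900-01-01:00:00:00.000000', C2)
-- Updates: default the column only when it is supplied as NULL
C2 = @IF (@COLTEST (C2, NULL), '1900-01-01:00:00:00.000000', C2)

These are fragments of a COLMAP clause rather than complete mappings.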
The first problem is that, by default, you are not allowed to have more than one table mapping per table. To get around this, you need to use the “ALLOWDUPTARGETMAP” parameter. You can then add multiple mappings.
You need to be aware that each mapping will fire for each transaction action. If you have two active table mappings for the same table, you will end up with two inserts/updates/deletes. Get this mapping wrong and your data integrity will be destroyed, and you will get a lot of constraint errors. In this case we have two mappings, one for inserts and one for updates and deletes, and we use the get/ignore parameters (GETINSERTS, IGNOREUPDATES and so on) to indicate which actions each mapping should apply to.
The following example is from a data pump, but the same approach is valid for any extract or replicat.
-- ggsci: add extract p_neil, exttrailsource /u02/gg/bin12/dirdat/NE, BEGIN NOW
-- ggsci: add rmttrail ./dirdat/NP extract p_neil megabytes 100

EXTRACT p_neil
USERID owner_goldengate@DB_LOCAL PASSWORD password
passthru
TARGETDEFS ./dirdef/defgen.neil.def
RMTHOST remote.server.world mgrport 7809
RMTTRAIL ./dirdat/NP

-- So we can have multiple mappings for a single table. This is a dangerous parameter!
ALLOWDUPTARGETMAP

-- FOR INSERTS - REPLACE MISSING COLUMNS
getinserts
ignoreupdates
ignoredeletes
TABLE NCHA.NEIL, TARGET NCHA.NEIL, COLMAP (usedefaults, &
C2 = @IF (@COLTEST (C2 , NULL, MISSING) , '1900-01-01:00:00:00.000000' , C2 ), &
C3 = @IF (@COLTEST (C3 , NULL, MISSING) , '1900-01-01:00:00:00.000000' , C3 ) &
);

-- FOR UPDATES - IGNORE MISSING COLUMNS
-- We will do the deletes here too. If they are supplied as NULL they should be modified.
-- You may need to do a separate section for deletes depending upon your rules.
ignoreinserts
getupdates
getdeletes
TABLE NCHA.NEIL, TARGET NCHA.NEIL, COLMAP (usedefaults, &
C2 = @IF (@COLTEST (C2 , NULL) , '1900-01-01:00:00:00.000000' , C2 ), &
C3 = @IF (@COLTEST (C3 , NULL) , '1900-01-01:00:00:00.000000' , C3 ) &
);

-- And back to normal for subsequent table mappings
getinserts
getupdates
getdeletes
Is it possible to replicate table rows only when specific columns are updated? The column I am interested in has a NULL value and is not a primary key. More importantly, these updates have to be converted into inserts on the target.
The short answer is yes, but it’s a little complicated and you will need to test carefully.
Your TABLE mapping in the EXTRACT will need a FILTER option to only extract the records you need to be replicated, probably using the @COLTEST function in the filter to determine whether the column is null (and present).
In the Replicat you can use INSERTUPDATES to convert updates into insert statements.
You will need to ensure you have appropriate levels of supplemental logging in the source database so that you capture all of the columns you need into the trail files and they are there to be inserted. This will add overhead to your redo logs.
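Putting those pieces together, a rough sketch might look like the following. The table SCOTT.EMP, the column STATUS and the target name are hypothetical and this is untested, so prove it out in your own environment:

-- ggsci: ADD TRANDATA SCOTT.EMP ALLCOLS   (all-column supplemental logging on the source)

-- Extract: only capture updates where STATUS is supplied and is NULL
TABLE SCOTT.EMP, FILTER (ON UPDATE, @COLTEST (STATUS, NULL));

-- Replicat: convert the captured updates into inserts on the target
INSERTUPDATES
MAP SCOTT.EMP, TARGET SCOTT.EMP_COPY;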
Can you please help in case I need to skip a row insert/update on the target when a column is missing from the inserts? Many thanks in advance.
I don’t have time to spin up a proof, but I would expect you could put a FILTER on your table definition (in the extract or replicat) with a @COLTEST condition.
Something like:
TABLE emp, FILTER (@COLTEST(emp_no,PRESENT))
should only map the table when column “emp_no” is present.
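If you need it on the Replicat side instead, the equivalent sketch (assuming a hypothetical HR.EMP table and an EMP_NO column, so adjust the names) would be something like:

-- Only apply the trail record when EMP_NO was supplied
MAP HR.EMP, TARGET HR.EMP, FILTER (@COLTEST (EMP_NO, PRESENT));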
Hi Neil,
I have a table column that is nullable and defaults to the value "A". When a different value "B" is inserted into the column on the source, it is not reflected on the target database; instead the default value "A" is inserted. Supplemental logging is enabled on all columns. Any suggestions to overcome this issue? Thanks in advance.
Is the data that is being inserted actually captured in the trail file? You can check by reading the trail file using logdump.
If it is being captured, is it then being manipulated by the replicat definition?
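For reference, a quick logdump session might look like the sketch below. It is run from the GoldenGate home; the trail name comes from the earlier pump example and the sequence number will differ in your environment, and the lines starting with -- are annotations rather than logdump commands:

./logdump
-- open the remote trail written by the pump
OPEN ./dirdat/NP000000000
-- show the record header (operation type, table name) and the column data
GHDR ON
DETAIL DATA
-- restrict the output to the table in question
FILTER INCLUDE FILENAME NCHA.NEIL
-- step through the records and inspect the column values that were captured
NEXT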