Goldengate Data Manipulation – When Inserts & Updates differ

One very useful aspect of GoldenGate is that it allows you to manipulate data as it passes between the source and the destinations.

One problem I encountered recently was the need to alter data differently for inserts than for updates. The receiving system needed default values in certain columns, which may or may not be supplied by the insert or update statements. This is slightly more complex than it first appears:

  • If we have an INSERT and the column value IS NULL, or the column value IS NOT SUPPLIED by the insert statement, we should set the default for that column.
  • If we have an UPDATE and the column value IS NULL, then we should set the default for that column.
    We must NOT set the default if the UPDATE does not supply the column, otherwise we may incorrectly overwrite data in the target system.

The first problem is that, by default, you are not allowed to have more than one table mapping per table. To get around this, you need to use the “ALLOWDUPTARGETMAP” parameter. You can then add multiple mappings.

You need to be aware that each mapping will fire for each transaction action. If you have 2 active table mappings for the same table, you will end up with 2 inserts/updates/deletes. Get the mappings wrong and your data integrity will be destroyed, and you will get a lot of constraint errors. In this case we have 2 mappings, one for inserts and one for updates and deletes, and we use the GET*/IGNORE* parameters to indicate which actions each mapping should process.

The following example is for a data pump, but the same approach is valid for primary extracts and replicats.

 

-- ggsci: add extract p_neil, exttrailsource /u02/gg/bin12/dirdat/NE, BEGIN NOW
-- ggsci: add rmttrail ./dirdat/NP extract p_neil megabytes 100

EXTRACT p_neil
USERID owner_goldengate@DB_LOCAL PASSWORD password
-- Column mapping cannot be performed in PASSTHRU mode, so the pump runs in NOPASSTHRU mode for the mapped table
NOPASSTHRU
TARGETDEFS ./dirdef/defgen.neil.def
RMTHOST remote.server.world mgrport 7809
RMTTRAIL ./dirdat/NP

-- So we can have multiple mappings for a single table. This is a dangerous parameter!
ALLOWDUPTARGETMAP

-- FOR INSERTS - REPLACE MISSING COLUMNS
getinserts
ignoreupdates
ignoredeletes

TABLE NCHA.NEIL, TARGET NCHA.NEIL, &
COLMAP (usedefaults, &
C2 = @IF (@COLTEST (C2 , NULL, MISSING) , '1900-01-01:00:00:00.000000' , C2 ), &
C3 = @IF (@COLTEST (C3 , NULL, MISSING) , '1900-01-01:00:00:00.000000' , C3 ) &
);

-- FOR UPDATES - IGNORE MISSING COLUMNS
-- We will do the deletes here too. If they are supplied as NULL they should be modified
-- You may need to do a separate section for deletes depending upon your rules.
ignoreinserts
getupdates
getdeletes

TABLE NCHA.NEIL, TARGET NCHA.NEIL, &
COLMAP (usedefaults, &
C2 = @IF (@COLTEST (C2 , NULL) , '1900-01-01:00:00:00.000000' , C2 ), &
C3 = @IF (@COLTEST (C3 , NULL) , '1900-01-01:00:00:00.000000' , C3 ) &
);

-- And back to normal for subsequent table mappings
getinserts
getupdates
getdeletes
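
Once the pump is running, it is worth confirming that each mapping fires the operations you expect. A quick sanity check from GGSCI (shown in the same style as the comments above; the exact STATS output format varies between GoldenGate versions) is to compare the reported operation counts with the activity on the source table:

-- ggsci: start extract p_neil
-- ggsci: info extract p_neil
-- ggsci: stats extract p_neil, totalsonly NCHA.NEIL

If a mapping is wrong you will typically see doubled-up operation counts here long before the constraint errors appear on the target.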

6 Responses to Goldengate Data Manipulation – When Inserts & Updates differ

  1. Mohammed Aashiq says:

    Is it possible to replicate table rows only when specific columns are updated? The column I am interested in has a NULL value and is not a primary key. More importantly, these updates have to be converted into inserts on the target.


    • The short answer is yes, but it’s a little complicated and you will need to test carefully.

      Your TABLE mapping in the EXTRACT will need a FILTER option so that only the records you need to be replicated are extracted, probably using the @COLTEST function in the filter to determine whether the column is null (and whether the column is present).

      In the Replicat you can use INSERTUPDATES to convert updates into insert statements.

      You will need to ensure you have appropriate levels of supplemental logging enabled in the source database so that all of the columns you need are captured into the trail files and are available to be inserted. This will add overhead to your redo logs.
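
      As a rough sketch only (reusing the NCHA.NEIL table from the post; SOME_COL is a placeholder for the column of interest, and the exact @COLTEST condition depends on your rule), the extract and replicat parameter files might contain something like:

      -- Extract: capture only the updates of interest, skip inserts and deletes
      ignoreinserts
      ignoredeletes
      getupdates
      TABLE NCHA.NEIL, FILTER (ON UPDATE, @COLTEST (SOME_COL, NULL));

      -- Replicat: convert the captured updates into inserts on the target
      INSERTUPDATES
      MAP NCHA.NEIL, TARGET NCHA.NEIL;

      Because the update becomes an insert, the replicat needs every required target column to be present in the trail record, which is why the supplemental logging mentioned above matters.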


  2. Vamshi says:

    Can you please help in case I need to skip a row insert/update on the target when a column is missing from the inserts? Many thanks in advance.


    • I don’t have time to spin up a proof, but I would expect you could put a FILTER on your table definition (in the extract or replicat) with a @COLTEST condition.

      Something like:
      TABLE emp, FILTER (@COLTEST(emp_no,PRESENT))

      should only map the table when column “emp_no” is present.


  3. Nagarjuna says:

    Hi Neil,

    I have a table column that is nullable and defaults to the value "A". When a different value "B" is inserted into the column on the source, it is not reflected on the target databases; instead the default value "A" is inserted. Supplemental logging is enabled on all columns. Any suggestions to overcome this issue? Thanks in advance.


    • Is the data being inserted actually being captured in the trail file? You can check by reading the trail file using logdump.
      If it is being captured, is it being manipulated by the replicat definition?
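
      A minimal Logdump session to check this might look like the following (the trail file name is only an example; open the current file from your target trail):

      logdump> open ./dirdat/NP000000001
      logdump> ghdr on
      logdump> detail data
      logdump> filter include filename NCHA.NEIL
      logdump> next

      This shows the record header and column data for each captured operation on the table, so you can see exactly which column values arrived in the trail.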

