Oracle 12C Problem with datapatch. Part 2, the “fix”

so after much too-ing and fro-ing with Oracle Support, we finally have a solution to my datapatch problem. Prevent the timeout from happening when running datapatch! We know which part of the code was timing out, and we can do this by setting an undocumented event: 18219841.

Here’s a show test to show you how that was done:

SQL: select dbms_sqlpatch.verify_queryable_inventory from dual;

VERIFY_QUERYABLE_INVENTORY
--------------------------
ORA-20008: timed out. Job Load_opatch_inventory_1execution time is more than 120Secs

(wait a while for the background job to clean up)

SQL: alter session set events '18219841 trace name context forever';

Session Altered

SQL: select dbms_sqlpatch.verify_queryable_inventory from dual;

VERIFY_QUERYABLE_INVENTORY
--------------------------
OK

patch 19769480 installed in all nodes
patch 20299023 installed in all nodes
patch 28031110 installed in all nodes

and switch it back off

alter session set events '18219841 trace name context off';

You may note above that I was changing the context for the session only, as a test. To get it to work with datapatch, you will need to use “alter system” commands before and after running datapatch.

I would recommend that you do not set this event without first contacting Oracle Support. You may wish to quote “Bug 21121707” I would recommend you un-set it afterwards too – you don’t know what it’s doing and you don’t understand if there are any negative side-effects.

To check if the event is set in your session/system:

in sqlplus...

sql: var level number
sql: exec dbms_system.read_ev(1821984, :level);
sql: print :level

              LEVEL 
-------------------
                  1

Addendum: After a twitter conversation, it was noted by Martin Berger (@martinberx) that the nature of the query had changed with the event set, and that it was no longer calling the scheduler jobs. I did note, however, that it’s still calling the same pre-processor, but without the timeout (so it will run until it’s finished, however long that may be): $ORACLE_HOME/QOPatch/qopiprep.bat  – It’s getting information from GV$QPX_INVENTORY, whatever that view does. Answers on a post card (or comment) please.

Advertisement

Oracle 12C Problem with datapatch

I was minding my business, patching an Oracle 12C RAC Cluster when the darned patch failed

./datapatch -verbose

SQL Patching tool version 12.1.0.2.0 on Mon Aug  3 20:47:31 2015
Copyright (c) 2015, Oracle.  All rights reserved.
Log file for this invocation: /u01/app/oracle/cfgtoollogs/sqlpatch/sqlpatch_9841_2015_08_03_20_47_31/sqlpatch_invocation.log

Connecting to database...OK
Bootstrapping registry and package to current versions...done

Queryable inventory could not determine the current opatch status.
Execute 'select dbms_sqlpatch.verify_queryable_inventory from dual'
and/or check the invocation log
/u01/app/oracle/cfgtoollogs/sqlpatch/sqlpatch_9841_2015_08_03_20_47_31/sqlpatch_invocation.log
for the complete error.
Prereq check failed, exiting without installing any patches.

Please refer to MOS Note 1609718.1 and/or the invocation log
/u01/app/oracle/cfgtoollogs/sqlpatch/sqlpatch_9841_2015_08_03_20_47_31/sqlpatch_invocation.log
for information on how to resolve the above errors.

SQL Patching tool complete on Mon Aug 3 20:49:37 2015

So, following the instructions, I ran the query used to ID the inventory

[oracle@rac12c01 OPatch]$ sqlplus / as sysdba
SQL*Plus: Release 12.1.0.2.0 Production on Mon Aug 3 20:59:37 2015

Copyright (c) 1982, 2014, Oracle. All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

20:59:37 SYS @ ORCL1 > select dbms_sqlpatch.verify_queryable_inventory from dual;

VERIFY_QUERYABLE_INVENTORY
--------------------------
ORA-20008: Timed out, Job Load_opatch_inventory_2execution time is more than 120Secs

op_scpt_path :/u01/app/oracle/product/12.1.0/dbhome_1/QOpatchpos1

The code has timed out! So, there are some clues about what’s going on with this code. This procedure has a preprocessor, which needs to get information into the database. To do this, it invokes a scheduler job [Load_opatch_inventory_n where ‘n’ represents the cluster node executing the command], to pull in information from the o/s. Here’s a job running, with it’s definition – note the job name is mentioned in the failure above:

select job_name,state,job_action from dba_scheduler_jobs where job_name like '%PATCH%' order by job_name;
JOB_NAME                STATE    JOB_ACTION
----------------------- -------- ----------
LOAD_OPATCH_INVENTORY   DISABLED BEGIN dbms_qopatch.opatch_inv_refresh_job(); END;
LOAD_OPATCH_INVENTORY_1 DISABLED BEGIN dbms_qopatch.opatch_inv_refresh_job(); END;
LOAD_OPATCH_INVENTORY_2 RUNNING  BEGIN dbms_qopatch.opatch_inv_refresh_job(); END;

and dbms_qopatch.opatch_inv_refresh_job() also invokes a preprocessor. This preprocessor is a batch job. A shell script in /u01/app/oracle/product/12.1.0/dbhome_1/QOpatch called qopiprep.bat And this shell script basically runs…

opatch lsinventory -xml

to get the current patches, formatted as XML. It then tacks a delimiter onto the end and returns it to the calling job before cleaning up after itself. This command can be slow-ish to execute, but on the problem cluster in question, it was running particularly slowly.

This command runs on all nodes, which return the output to the calling job, which then has to process the inbound XML files and determine if all is OK in the world of opatch before continuing. You have 2 minutes. GO! If you don’t finish in 2 minutes, the who thing fails, with no obvious remedial action possible.

One option you have, should you encounter this problem and be desperate for a workaround is to preprocess the XML file, and just get the batch file to return your handywork:

NOTE: This is very much a one-off to keep things moving and you must return the preprocessor code to it’s normal state and raise a call with Oracle Support to get this fixed properly. Do this workaround at your own risk. I don’t recommend it. Raise a call with Oracle Support instead.

save the pre-processor to put back later:

cp -p qopiprep.bat qopiprep.bat.sav

run the commands in the file:

$ORACLE_HOME/OPatch/opatch lsinventory -xml $ORACLE_HOME/QOpatch/xml_file.xml -retry 0 -invPtrLoc $ORACLE_HOME/oraInst.loc >> $ORACLE_HOME/QOpatch/stout.txt

tack the delimiter string onto the end of the XML file

echo "UIJSVTBOEIZBEFFQBL" >> $ORACLE_HOME/QOpatch/xml_file.xml

edit the sciprt (qopiprep.bat) and remove the bottom 5 commands EXCEPT the “echo `cat…” bit, which returns the output to the calling code:

# Option: "-all" gives information on all Oracle Homes
# installed in the central inventory. With that information, the
# patches of non-RDBMS homes could be fetched.
$ORACLE_HOME/OPatch/opatch lsinventory -xml $ORACLE_HOME/QOpatch/xml_file.xml -retry 0 -invPtrLoc $ORACLE_HOME/oraInst.loc >> $ORACLE_HOME/QOpatch/stout.txt
`echo "UIJSVTBOEIZBEFFQBL" >> $ORACLE_HOME/QOpatch/xml_file.xml`
echo `cat $ORACLE_HOME/QOpatch/xml_file.xml`
rm $ORACLE_HOME/QOpatch/xml_file.xml
 rm $ORACLE_HOME/QOpatch/stout.txt

Now when you run, the dbms_sqlpatch command, you have taken the opatch lsinventory slow aspect of the preprocessor out of the equation:

21:04:13 SYS @ ORCL1 > select dbms_sqlpatch.verify_queryable_inventory from dual;
VERIFY_QUERYABLE_INVENTORY
--------------------------
OK
op_scpt_path :/u01/app/oracle/product/12.1.0/dbhome_1/QOpatchpos1
patch 19769480 installed in all nodes
patch 20299023 installed in all nodes
patch 20831110 installed in all nodes

don’t forget to put the correct processing code back once the datapatch -verbose has been completed successfully. And you need to clean up all of the output files you created, otherwise the code will fail again.

Hey, let’s be careful out there.