Oracle SQL Monitor not monitoring my SQL

I needed to monitor a SQL statement in 11.2.0.3 (the limits mentioned below are the same in 12.1, 12.2, 18.4 and 19C) to determine what it was doing and why it was slow.

Usually I would use SQL Monitor [NOTE: You need to license the Oracle Tuning Pack to use SQL Monitor] for this, but the SQL was not appearing in there, despite running for over 5 seconds and being a parallel statement (either of which should qualify it for inclusion in SQL Monitor). So I asked Twitter why, and thought I’d share the output here.

https://twitter.com/ChandlerDBA/status/1075692070952677376

It was nailed immediately by Jonathan Lewis, with added help from Ivica Arsov. (thank you!)

There is a hidden parameter “_sqlmon_max_planlines” which states that any SQL with a plan in excess of 300 lines should not be monitored (see below for SQLMon hidden parameters – and change them at your own risk, preferably with the backing of an SR from Oracle Support). This execution plan had well over 300 lines. The solution is to change either the session or the system to allow monitoring to happen when the plan is over 300 lines.

e.g.

alter system  set "_sqlmon_max_planlines"=500 scope=memory sid='*';
or
alter session set "_sqlmon_max_planlines"=500;

The negative side effect is that the monitoring will use more resources (primarily memory and CPU), which is why there are default limits on this feature. You might want to change it back when you’re finished to conserve resources.

Note that if you change the system parameter whilst the SQL is running, it will start to monitor the SQL at that point, so you will only get a partial picture of what is taking place, which is less valuable.

select ksppinm, ksppstvl, ksppdesc
  from sys.x$ksppi a, sys.x$ksppsv b
 where a.indx=b.indx
  and lower(ksppinm) like lower('%sqlmon%')
order by ksppinm;

KSPPINM                   KSPPSTVL  KSPPDESC
------------------------- --------- --------------------------------------------------------------------------------
_sqlmon_binds_xml_format  default   format of column binds_xml in [G]V$SQL_MONITOR
_sqlmon_max_plan          640       Maximum number of plans entry that can be monitored. Defaults to 20 per CPU
_sqlmon_max_planlines     300       Number of plan lines beyond which a plan cannot be monitored
_sqlmon_recycle_time      60        Minimum time (in s) to wait before a plan entry can be recycled
_sqlmon_threshold         5         CPU/IO time threshold before a statement is monitored. 0 is disabled

You may also notice a few other parameters in there. The “_sqlmon_recycle_time” parameter shows the minimum time (in seconds) for which a SQLMon plan entry is guaranteed to be retained. Any retention beyond that is a bonus and depends upon how much SQL needs to be monitored. I see monitoring plans disappearing after 2-3 minutes on some systems, so you need to be quick, and you should save the plans down to disk.
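Saving a report down to disk can be scripted. What follows is a minimal sketch rather than a definitive method: it assumes the Tuning Pack is licensed, that you can connect with SQL*Plus, and that you already know the SQL_ID. The function name build_sqlmon_script, the output file name and the SQL*Plus settings are my own illustrative choices. The function only prints a SQL*Plus script, which you then pipe into sqlplus yourself.

```shell
#!/bin/bash
# Sketch only: prints a SQL*Plus script that spools a SQL Monitor report to a file.
# build_sqlmon_script, the default file name and the "set" values are illustrative assumptions.
build_sqlmon_script() {
  local sql_id="$1"
  local out_file="${2:-sqlmon_${sql_id}.html}"
  cat <<EOF
set long 1000000 longchunksize 1000000 linesize 1000 pagesize 0
set trimspool on trim on echo off feedback off heading off verify off
spool ${out_file}
select dbms_sqltune.report_sql_monitor(
         sql_id       => '${sql_id}',
         type         => 'HTML',
         report_level => 'ALL') as report
  from dual;
spool off
exit
EOF
}

# Example use (the connect string is a placeholder for your own):
#   build_sqlmon_script <your_sql_id> | sqlplus -s / as sysdba
```

The type argument also accepts 'TEXT' and 'ACTIVE'. Because the plan entry may be recycled shortly after the statement finishes, it is worth running this while the statement is still executing or immediately afterwards.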


The mad thing is that I was aware of this restriction before I posted my request for help on Twitter, but I’d completely forgotten about it. So here’s the blog post to help me remember!


Checking the alert log – the easy way

Do you check the alert log of your databases every day? In the morning when you get in? But what about the alerts which happen during the day? How do you spot them, especially if you don’t have Grid Control or Cloud Control configured? Even if you do have a full monitoring solution, this can be useful for a belt-and-braces approach.

Here’s a short bash shell script that uses adrci to read through each ADR home under an ORACLE_HOME (for a DIAG location) and check every alert log contained therein, using adrci pattern matching functionality to search for problems. I usually schedule it within each host (using cron) to minimise the moving parts, and therefore minimise the opportunity for it to stop working. Any problems, and I get an email. I hope you find it useful. I usually keep it in /opt/oracle/bin, but you can put it in your script home of choice.

This should work for 11G and 12C databases (tested to 12.1.0.2), unless I’ve made a cut/paste error 🙂

#!/bin/bash
#########################################################################################
# Description: Read each Oracle Home directory. Run adrci matching for problems
# Author : N Chandler.2014-03-28
#
# crontab : # Check Alert Log 30.03.2014
# 00,30 * * * * /opt/oracle/bin/adrci_alert.sh > /opt/oracle/bin/log/adrci.cron.log 2>&1
#
#########################################################################################
# Which HOME?
 export ORACLE_HOME=/opt/app/oracle/product/11g
 export DIAG_LOC=/opt/app/oracle/diag/rdbms
# Who gets the alert?
 export RECIPIENT='neil@chandler.uk.com'
# Other Variables
 export LD_LIBRARY_PATH=$ORACLE_HOME/lib
 export HOST=`hostname`
 export PATH=$ORACLE_HOME/bin:$PATH
 export NLS_DATE_FORMAT='yyyy-mm-dd hh24:mi:ss'
 export SUBJECT="Oracle ALERTS on ${HOST} OK"
 export LOG=/tmp
 export ALERT=$LOG/error.txt

# Write the alert log message header for the email
 echo "${HOST} `date +%Y-%m-%d.%H:%M:%S-%Z`" > ${ALERT}
 echo "All alerts in ADRCI Alert log for the last 30 minutes" >> ${ALERT}
 echo "THIS ALERT WILL NOT BE REPEATED!!! TAKE ACTION NOW!!!" >> ${ALERT}
 echo "Follow-up on this email and check the alert log on ${HOST}" >> ${ALERT}

# find out the homes
 adrci_homes=( $(adrci exec="show homes" | grep -e rdbms -e asm))

# run through Each home found and examine the alert log
# Here we are looking for ORA- messages, Deadlock, anything which raises an incident or anything which is instance-level
# IN THE LAST 30 MINUTES (1/48), so we need to run this code every 30 minutes or we may miss something. 
 for adrci_home in ${adrci_homes[@]}
 do
   echo "Checking: ${adrci_home}" >> ${ALERT}
   echo $adrci_home' Alert Log' >> ${ALERT}
   adrci exec="set home ${adrci_home} ; show alert -p \\\"(message_text like '%ORA-%' or message_text like '%Deadlock%' or message_text like '%instance%' or message_text like '%incident%') and originating_timestamp>=systimestamp-(1/48) \\\"" -term >>${ALERT}
 done
# count the errors. This is a good place to exclude specific errors you wish to ignore with a -v match.
# note - your grep must be aligned with the pattern match above for this to work
# the -v exclusion must run before the count: grep -c prints only a number, so a -v applied after it filters nothing
num_errors=`grep -v 'ORA-28' ${ALERT} | grep -c -e 'TNS' -e 'ORA' -e 'Deadlock' -e 'instance' -e 'incident'`

# If there are any errors, lets email the alert information to someone
if [ $num_errors -gt 0 ]
then
  SUBJECT="ERROR in Oracle ALERT log on ${HOST}"
  mail -s "${SUBJECT}" ${RECIPIENT} < ${ALERT}
fi
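
The counting step can be tried on its own, without adrci or a database. The sketch below is illustrative only: count_alert_lines is a hypothetical helper and the sample lines are made up for the demonstration. It shows why any exclusion has to be applied before the count, since grep -c emits only a number and a later -v would have no error text left to filter.

```shell
#!/bin/bash
# Sketch only: count alert lines of interest while ignoring an excluded error family.
# count_alert_lines and the sample text are illustrative assumptions, not real alert log output.
count_alert_lines() {
  local file="$1"
  # Exclude first, then count. grep -c exits non-zero when the count is 0,
  # so "|| true" keeps the function's exit status clean while still printing 0.
  grep -v 'ORA-28' "${file}" \
    | grep -c -e 'TNS' -e 'ORA' -e 'Deadlock' -e 'instance' -e 'incident' || true
}

# Build a small made-up sample: one countable ORA- line, one excluded ORA-28 line,
# one line that matches nothing, and one Deadlock line.
sample=$(mktemp)
cat > "${sample}" <<'EOF'
2014-03-28 10:00:01 ORA-00600: internal error code
2014-03-28 10:05:12 ORA-28001: the password has expired
2014-03-28 10:07:45 Thread 1 advanced to log sequence 42
2014-03-28 10:09:30 Deadlock detected
EOF

count_alert_lines "${sample}"   # prints 2
rm -f "${sample}"
```

To ignore further errors, extend the grep -v with additional -e patterns, keeping them in line with the patterns used in the adrci query.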