Oracle Table Prefs

Very quick post, for me really: some SQL to tell me all of the Table Prefs for any specific table (all Oracle 19c prefs)
You could always just query DBA_TAB_STAT_PREFS

select * from user_tab_stat_prefs where table_name = '&&TABLE';

But that only tells you what you have explicitly set for a table, which will override any GLOBAL prefs. The order that prefs are used is TABLE -> GLOBAL -> DEFAULT. If some DBA has changed a global pref, how would you know that it’s affecting this particular table?
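
To illustrate the precedence, here's a minimal sketch (the table name is hypothetical, and STALE_PERCENT is just the example pref) of a table-level pref winning over a changed global pref when DBMS_STATS.GET_PREFS resolves it:

exec dbms_stats.set_global_prefs('STALE_PERCENT','5')
exec dbms_stats.set_table_prefs(user,'MY_TABLE','STALE_PERCENT','20')

-- resolved TABLE -> GLOBAL -> DEFAULT
select dbms_stats.get_prefs('STALE_PERCENT', user, 'MY_TABLE') from dual;  -- 20 (table pref)
select dbms_stats.get_prefs('STALE_PERCENT') from dual;                    -- 5  (global pref)

To see every preference that applies to one table, though, you have to ask DBMS_STATS.GET_PREFS for each pref in turn.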

Here’s a quick piece of SQL to do that, which formats nice and readably (for me!)

undefine TABLE
prompt Enter Table Name 
prompt &&TABLE

select rpad('ANDV_ALGO_INTERNAL_OBSERVE : ',42)||dbms_stats.get_prefs(pname=>'ANDV_ALGO_INTERNAL_OBSERVE', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('APPROXIMATE_NDV : ',42)||dbms_stats.get_prefs(pname=>'APPROXIMATE_NDV', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('APPROXIMATE_NDV_ALGORITHM : ',42)||dbms_stats.get_prefs(pname=>'APPROXIMATE_NDV_ALGORITHM', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('AUTO_STAT_EXTENSIONS : ',42)||dbms_stats.get_prefs(pname=>'AUTO_STAT_EXTENSIONS', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('AUTOSTATS_TARGET : ',42)||dbms_stats.get_prefs(pname=>'AUTOSTATS_TARGET', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('AUTO_TASK_INTERVAL : ',42)||dbms_stats.get_prefs(pname=>'AUTO_TASK_INTERVAL', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('AUTO_TASK_MAX_RUN_TIME : ',42)||dbms_stats.get_prefs(pname=>'AUTO_TASK_MAX_RUN_TIME', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('AUTO_TASK_STATUS : ',42)||dbms_stats.get_prefs(pname=>'AUTO_TASK_STATUS', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('CASCADE : ',42)||dbms_stats.get_prefs(pname=>'CASCADE', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('CONCURRENT : ',42)||dbms_stats.get_prefs(pname=>'CONCURRENT', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('COORDINATOR_TRIGGER_SHARD : ',42)||dbms_stats.get_prefs(pname=>'COORDINATOR_TRIGGER_SHARD', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('DEBUG : ',42)||dbms_stats.get_prefs(pname=>'DEBUG', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('DEGREE : ',42)||dbms_stats.get_prefs(pname=>'DEGREE', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('ENABLE_HYBRID_HISTOGRAMS : ',42)||dbms_stats.get_prefs(pname=>'ENABLE_HYBRID_HISTOGRAMS', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('ENABLE_TOP_FREQ_HISTOGRAMS : ',42)||dbms_stats.get_prefs(pname=>'ENABLE_TOP_FREQ_HISTOGRAMS', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('ESTIMATE_PERCENT : ',42)||dbms_stats.get_prefs(pname=>'ESTIMATE_PERCENT', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('GATHER_AUTO : ',42)||dbms_stats.get_prefs(pname=>'GATHER_AUTO', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('GATHER_SCAN_RATE : ',42)||dbms_stats.get_prefs(pname=>'GATHER_SCAN_RATE', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('GLOBAL_TEMP_TABLE_STATS : ',42)||dbms_stats.get_prefs(pname=>'GLOBAL_TEMP_TABLE_STATS', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('GRANULARITY : ',42)||dbms_stats.get_prefs(pname=>'GRANULARITY', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('INCREMENTAL : ',42)||dbms_stats.get_prefs(pname=>'INCREMENTAL', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('INCREMENTAL_INTERNAL_CONTROL : ',42)||dbms_stats.get_prefs(pname=>'INCREMENTAL_INTERNAL_CONTROL', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('INCREMENTAL_LEVEL : ',42)||dbms_stats.get_prefs(pname=>'INCREMENTAL_LEVEL', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('INCREMENTAL_STALENESS : ',42)||dbms_stats.get_prefs(pname=>'INCREMENTAL_STALENESS', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('JOB_OVERHEAD : ',42)||dbms_stats.get_prefs(pname=>'JOB_OVERHEAD', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('JOB_OVERHEAD_PERC : ',42)||dbms_stats.get_prefs(pname=>'JOB_OVERHEAD_PERC', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('MAINTAIN_STATISTICS_STATUS : ',42)||dbms_stats.get_prefs(pname=>'MAINTAIN_STATISTICS_STATUS', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('METHOD_OPT : ',42)||dbms_stats.get_prefs(pname=>'METHOD_OPT', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('NO_INVALIDATE : ',42)||dbms_stats.get_prefs(pname=>'NO_INVALIDATE', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('OPTIONS : ',42)||dbms_stats.get_prefs(pname=>'OPTIONS', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('PREFERENCE_OVERRIDES_PARAMETER : ',42)||dbms_stats.get_prefs(pname=>'PREFERENCE_OVERRIDES_PARAMETER', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('PUBLISH : ',42)||dbms_stats.get_prefs(pname=>'PUBLISH', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('ROOT_TRIGGER_PDB : ',42)||dbms_stats.get_prefs(pname=>'ROOT_TRIGGER_PDB', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('SCAN_RATE : ',42)||dbms_stats.get_prefs(pname=>'SCAN_RATE', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('STALE_PERCENT : ',42)||dbms_stats.get_prefs(pname=>'STALE_PERCENT', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('STAT_CATEGORY : ',42)||dbms_stats.get_prefs(pname=>'STAT_CATEGORY', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('SYS_FLAGS : ',42)||dbms_stats.get_prefs(pname=>'SYS_FLAGS', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('TABLE_CACHED_BLOCKS : ',42)||dbms_stats.get_prefs(pname=>'TABLE_CACHED_BLOCKS', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('TRACE : ',42)||dbms_stats.get_prefs(pname=>'TRACE', tabname=>'&&TABLE') prefs_for__&&TABLE FROM dual UNION ALL
select rpad('WAIT_TIME_TO_UPDATE_STATS : ',42)||dbms_stats.get_prefs(pname=>'WAIT_TIME_TO_UPDATE_STATS', tabname=>'&&TABLE') prefs_for__&&TABLE
FROM dual;

Sample Output


PREFS_FOR__INTERVAL_TAB
------------------------
ANDV_ALGO_INTERNAL_OBSERVE :              FALSE
APPROXIMATE_NDV :                         TRUE
APPROXIMATE_NDV_ALGORITHM :               REPEAT OR HYPERLOGLOG
AUTO_STAT_EXTENSIONS :                    OFF
AUTOSTATS_TARGET :                        AUTO
AUTO_TASK_INTERVAL :                      900
AUTO_TASK_MAX_RUN_TIME :                  3600
AUTO_TASK_STATUS :                        OFF
CASCADE :                                 DBMS_STATS.AUTO_CASCADE
CONCURRENT :                              OFF
COORDINATOR_TRIGGER_SHARD :               FALSE
DEBUG :                                   0
DEGREE :                                  NULL
ENABLE_HYBRID_HISTOGRAMS :                3
ENABLE_TOP_FREQ_HISTOGRAMS :              3
ESTIMATE_PERCENT :                        DBMS_STATS.AUTO_SAMPLE_SIZE
GATHER_AUTO :                             AFTER_LOAD
GATHER_SCAN_RATE :                        HADOOP_ONLY
GLOBAL_TEMP_TABLE_STATS :                 SESSION
GRANULARITY :                             AUTO
INCREMENTAL :                             TRUE
INCREMENTAL_INTERNAL_CONTROL :            TRUE
INCREMENTAL_LEVEL :                       PARTITION
INCREMENTAL_STALENESS :                   USE_STALE_PERCENT,USE_LOCKED_STATS
JOB_OVERHEAD :                            -1
JOB_OVERHEAD_PERC :                       1
MAINTAIN_STATISTICS_STATUS :              FALSE
METHOD_OPT :                              FOR ALL COLUMNS SIZE 50
NO_INVALIDATE :                           DBMS_STATS.AUTO_INVALIDATE
OPTIONS :                                 GATHER
PREFERENCE_OVERRIDES_PARAMETER :          FALSE
PUBLISH :                                 TRUE
ROOT_TRIGGER_PDB :                        FALSE
SCAN_RATE :                               0
STALE_PERCENT :                           10
STAT_CATEGORY :                           OBJECT_STATS, REALTIME_STATS
SYS_FLAGS :                               1
TABLE_CACHED_BLOCKS :                     1
TRACE :                                   0
WAIT_TIME_TO_UPDATE_STATS :               15
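
If you'd rather not maintain the big UNION ALL, a PL/SQL loop over the same preference names produces equivalent output (a quick sketch, using DBMS_OUTPUT rather than the nicely-formatted query above):

set serveroutput on
declare
  type pref_names_t is table of varchar2(40);
  pref_names pref_names_t := pref_names_t(
    'ANDV_ALGO_INTERNAL_OBSERVE','APPROXIMATE_NDV','APPROXIMATE_NDV_ALGORITHM',
    'AUTO_STAT_EXTENSIONS','AUTOSTATS_TARGET','AUTO_TASK_INTERVAL','AUTO_TASK_MAX_RUN_TIME',
    'AUTO_TASK_STATUS','CASCADE','CONCURRENT','COORDINATOR_TRIGGER_SHARD','DEBUG','DEGREE',
    'ENABLE_HYBRID_HISTOGRAMS','ENABLE_TOP_FREQ_HISTOGRAMS','ESTIMATE_PERCENT','GATHER_AUTO',
    'GATHER_SCAN_RATE','GLOBAL_TEMP_TABLE_STATS','GRANULARITY','INCREMENTAL',
    'INCREMENTAL_INTERNAL_CONTROL','INCREMENTAL_LEVEL','INCREMENTAL_STALENESS','JOB_OVERHEAD',
    'JOB_OVERHEAD_PERC','MAINTAIN_STATISTICS_STATUS','METHOD_OPT','NO_INVALIDATE','OPTIONS',
    'PREFERENCE_OVERRIDES_PARAMETER','PUBLISH','ROOT_TRIGGER_PDB','SCAN_RATE','STALE_PERCENT',
    'STAT_CATEGORY','SYS_FLAGS','TABLE_CACHED_BLOCKS','TRACE','WAIT_TIME_TO_UPDATE_STATS');
begin
  for i in 1 .. pref_names.count loop
    dbms_output.put_line(rpad(pref_names(i)||' : ',42)||
                         dbms_stats.get_prefs(pref_names(i), tabname=>'&&TABLE'));
  end loop;
end;
/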

Parallel Madness

I've noticed at a few clients with data warehouses recently that the Developers and, upon occasion, Business Users have a real fondness for hinting the SQL they are producing with one particular hint. PARALLEL.

As any fule kno, this IS the magic go-faster hint. PARALLEL(2) is obviously twice as fast as serial execution. PARALLEL(4) is amazing and PARALLEL(64) like Usain Bolt on Red Bull.

The problem is that, like all database features, parallel query comes with a cost.

When you specify /*+ PARALLEL(n) */ you are telling the optimizer that it has a lot more resources to use to complete this particular query. Not a single thread but many. PARALLEL(10) will use 21 processes to complete its execution – 20 Parallel Execution Servers (10 producers, 10 consumers) and a coordinator (which is your connection's shadow process) which will deal with any aspects of the parallel plan which cannot be parallelised.

Allowed free rein to use PARALLEL, devs and users will quickly consume all of the resources available on a given server, causing contention which will inevitably slow down the overall execution of every piece of code executing on there. To illustrate this, I'd like to use an example I came across a while ago to show how an excessive degree of PARALLEL on a single statement can be a problem in itself.

Let's say I have a single server with 16 cores, lots of memory and a decent SSD array, so the problem will centre around the CPU. Inevitably your 16 cores will be hyperthreaded. This then looks to Oracle like you have 32 cores. Whilst Oracle knows you have 16 hyperthreaded cores, you get CPU_COUNT=32

NOTE: 16 cores hyperthreaded DO NOT have the power of 32 cores, especially when dealing with databases. Some database workloads are actually WORSE with hyperthreading enabled (e.g. Data Warehouse systems on SQL Server). Inevitably the server admins will have enabled it unless you can provide cast-iron evidence to have it disabled.

I have a statement which the users are complaining about. It starts with the  following code: SELECT /*+ PARALLEL */ (complex multi-table join)

So what does this unrestricted (and therefore DEFAULT!) degree of parallelism (DOP) do in this case?
The default DOP is defined as PARALLEL_THREADS_PER_CPU x CPU_COUNT = 2 x 32 = PARALLEL(64)
On RAC it is PARALLEL_THREADS_PER_CPU x CPU_COUNT x INSTANCE_COUNT!
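
You can check what that default works out to on your own instance with something like this (a quick sketch; both values come straight from v$parameter):

select to_number(p1.value) * to_number(p2.value) as default_dop
from   v$parameter p1, v$parameter p2
where  p1.name = 'cpu_count'
and    p2.name = 'parallel_threads_per_cpu';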

Let's have a look at the ACTIVITY of this PARALLEL(64) query:

[Screenshot: parallel_64_activity]

You can see from the screenshot that Oracle knows there are 16 cores but it has gone PARALLEL(64), using 128 parallel execution slaves and fully expecting to have the available resources to run PARALLEL(64) efficiently. The execution plan is calculated around this assumption. There are 64 parallel execution slaves attempting to work on this at the same time. It's worth looking at the metrics associated with this query.

Peaks of 2GB/s disk, 140GB of TEMP and 32 CPUs.

[Screenshot: parallel_64_metrics]

The query took 36.9 minutes to complete.

I had the query changed to inject a modicum of realism into the available resources at the time of the run, and restricted the DOP to PARALLEL(8).

[Screenshot: parallel_8_activity]

Oracle is restricted to the more limited amount of resource which is actually available. The execution plan is different, reflecting the lower amount of available resources. Looking at the metrics:

Peaks of 1GB/s, 3GB of TEMP and 12 CPUs.

[Screenshot: parallel_8_metrics]

The query took 10.3 minutes to complete – more than 3 times quicker!

It is worth noting that testing the query in isolation with PARALLEL(16) took 7 minutes to complete, but that DOP would have resource-starved the server as a whole causing everything else currently executing to slow down, and was discounted as an option.

With PARALLEL, less can be better.
Using PARALLEL for everything can be counter-productive.
Co-ordinating PARALLEL across multiple RAC nodes can be disastrous without careful design to take advantage of this feature (setting PARALLEL_FORCE_LOCAL=TRUE will restrict parallel processing to a single node). Oracle recommend you don't do this. Other opinions are available and I generally recommend setting this to TRUE.

We have a limited amount of resources on our servers. Working within those resource limitations will provide substantial benefits.

Before using Parallel processing, I’d recommend thoroughly reading the VLDB and Partitioning Guide appropriate to your database release!

If you use PARALLEL:

  • Use Resource Manager to restrict DOP appropriately for different groups of users (see the sketch after this list).
  • Consider setting PARALLEL_FORCE_LOCAL=TRUE
  • Consider setting PARALLEL_THREADS_PER_CPU=1, especially where you have hyperthreading enabled.
  • Consider your overall workload, not just the SQL you are working with right now.
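
For the Resource Manager point above, a minimal sketch of capping DOP for a group of users might look like this (the plan and consumer group names are made up – adjust to your own standards):

BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.CREATE_PLAN(plan => 'DAYTIME_PLAN', comment => 'Cap DOP for ad-hoc users');
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(consumer_group => 'ADHOC_GROUP', comment => 'Ad-hoc reporting users');
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan                     => 'DAYTIME_PLAN',
    group_or_subplan         => 'ADHOC_GROUP',
    comment                  => 'Limit ad-hoc users to DOP 8',
    parallel_degree_limit_p1 => 8);
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan             => 'DAYTIME_PLAN',
    group_or_subplan => 'OTHER_GROUPS',
    comment          => 'Everything else');
  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/

You still need to map your users to the consumer group and activate the plan, but once that directive is in force even an unrestricted /*+ PARALLEL */ from an ADHOC_GROUP user gets capped at DOP 8.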

 

Where’s my Oracle SQL Plan Baseline?

Not so long ago I was having fun creating SQL Plan Baselines in an old 11.2.0.3 database due to be decommissioned but which needs to keep running for a while (no doubt several years) – so minimal time/money to be expended on it. Then, one day, I couldn't create a baseline and needed to figure out why…

We get the occasional painful plan change, so pinning down an acceptable historic plan using a Baseline is becoming a regular occurrence. The standard route for this is:

1. Notice the plan isn't good and that we have been running with a good plan historically
2. Identify a plan hash value which is acceptable
3. Load it from the library cache if that plan is in there (unlikely)

[ Kerry Osborne has a useful blog post/script for this so I won’t reproduce here: http://kerryosborne.oracle-guy.com/2012/03/displaying-sql-baseline-plans/ ]

4. If not, create a SQL Tuning Set from AWR with that specific plan and convert the SQL Tuning Set into a Baseline. You will need the begin and end snap IDs containing the plan, the sql_id and the plan_hash_value for the specific plan you want to baseline:

Please ensure you have suitable licensing before running any of the following code!

declare
 baseline_ref_cur DBMS_SQLTUNE.SQLSET_CURSOR;
begin
 -- create an (empty) SQL Tuning Set...
 DBMS_SQLTUNE.CREATE_SQLSET('NEIL');
 -- ...then load it with the specific sql_id / plan_hash_value combination from the AWR snapshots
 open baseline_ref_cur for
 select VALUE(p) from table(
 DBMS_SQLTUNE.SELECT_WORKLOAD_REPOSITORY(&begin_snap_id, &end_snap_id,
   'sql_id='||CHR(39)||'4jcxvz3adqbs2'||CHR(39)||' and plan_hash_value=1905606778',
   NULL,NULL,NULL,NULL,NULL,NULL,'ALL')) p;
 DBMS_SQLTUNE.LOAD_SQLSET('NEIL', baseline_ref_cur);
end;
/

Did we get it?

select * from dba_sqlset_statements where sqlset_name = 'NEIL';

SQLSET_NAME, SQLSET_OWNER, SQLSET_ID, SQL_ID,       FORCE_MATCHING_SIGNATURE, SQL_TEXT,                                   PARSING_SCHEMA_NAME, PLAN_HASH_VALUE ...
----------------------------------------------------------------------------------------------------------------------------------------------------------
NEIL         NEIL_DBA             35  4jcxvz3adqbs2 6134983393191283611       INSERT WHEN CNT = 1 THEN INTO SOME_TABLE...  APP                  1905606778      ...

Cool! I have captured the SQL. Let's create the baseline using: dbms_spm.load_plans_from_sqlset

declare
cnt number;
begin
 cnt := dbms_spm.load_plans_from_sqlset(sqlset_name=>'NEIL',SQLSET_OWNER=>'NEIL_DBA');
 dbms_output.put_line('Plans Loaded : '||to_char(cnt));
end;
/
... and I get ...

Plans Loaded : 0

Zero plans? So what plans are in there? Has my plan actually appeared?

select SQL_HANDLE,PLAN_NAME,CREATOR,ORIGIN,PARSING_SCHEMA_NAME, ENABLED,ACCEPTED,REPRODUCED,CREATED
  from dba_sql_plan_baselines where origin like 'MANUAL-LOAD%' order by created desc;

SQL_HANDLE           PLAN_NAME                      CREATOR   ORIGIN      PARSING_SCHEMA_NAME  ENABLED  ACCEPTED  REPRODUCED  CREATED
-----------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_23829f7bf3712438 SQL_PLAN_270nzggtr291s1a6ce30c NEIL_DBA  MANUAL-LOAD APP                  YES      YES       YES         2018-12-31 00.00.00.000000000

No! But how do I prove that? Baselines don't have the SQL_ID associated with them.

Here’s some code from [blog post here: http://oracleprof.blogspot.com/2011/07/how-to-find-sqlid-and-planhashvalue-in.html ] which will take the SQL_Handle and display the SQL ID and Plan Hash Value (you may need an explicit grant to use DBMS_CRYPTO in your PDB for this to work):

For SQL Handle: SQL_23829f7bf3712438 

declare
v_sqlid VARCHAR2(13);
v_num number;
BEGIN
 dbms_output.put_line('SQL_ID '||' '|| 'PLAN_HASH_VALUE' || ' ' || 'SQL_HANDLE ' || ' ' || 'PLAN_NAME');
 dbms_output.put_line('-------------'||' '|| '---------------' || ' ' || '------------------------------' || ' ' || '--------------------------------');
 for a in (select sql_handle, plan_name, trim(substr(g.PLAN_TABLE_OUTPUT,instr(g.PLAN_TABLE_OUTPUT,':')+1)) plan_hash_value, sql_text
from (select t.*, c.sql_handle, c.plan_name, c.sql_text from dba_sql_plan_baselines c, table(dbms_xplan.DISPLAY_SQL_PLAN_BASELINE(c.sql_handle, c.plan_name)) t
where c.sql_handle = '&sql_handle') g
where PLAN_TABLE_OUTPUT like 'Plan hash value%') loop
  v_num := to_number(sys.UTL_RAW.reverse(sys.UTL_RAW.SUBSTR(dbms_crypto.hash(src => UTL_I18N.string_to_raw(a.sql_text || chr(0),'AL32UTF8'), typ => 2),9,4)) || sys.UTL_RAW.reverse(sys.UTL_RAW.SUBSTR(dbms_crypto.hash(src => UTL_I18N.string_to_raw(a.sql_text || chr(0),'AL32UTF8'), typ => 2),13,4)),RPAD('x', 16, 'x'));
  v_sqlid := '';
  FOR i IN 0 .. FLOOR(LN(v_num) / LN(32))
  LOOP
   v_sqlid := SUBSTR('0123456789abcdfghjkmnpqrstuvwxyz',FLOOR(MOD(v_num / POWER(32, i), 32)) + 1,1) || v_sqlid;
  END LOOP;
 dbms_output.put_line(v_sqlid ||' ' || rpad(a.plan_hash_value,15) || ' ' || rpad(a.sql_handle,30) || ' ' || rpad(a.plan_name,30));
end loop;
end;
/

SQL_ID        PLAN_HASH_VALUE SQL_HANDLE                     PLAN_NAME
------------- --------------- ------------------------------ --------------------------------
7f4n59twq8mrj 3036703685      SQL_23829f7bf3712438           SQL_PLAN_270nzggtr291s1a6ce30c 

 

That baseline is definitely not mine. It’s for the wrong SQL ID!
The “Plans Loaded” message was correct when it said “0“!

Why? There was no error message, no output other than the number of plans loaded. That sucks dbms_spm!

I need to trace it. Do that by using dbms_spm.configure

(thank you Timur Akhmadeev for helping me! It’s not easy to google/MOS how to do this, hence this post so I don’t forget again!)

declare
cnt number;
begin
 dbms_spm.configure('spm_tracing',1);
 cnt := dbms_spm.load_plans_from_sqlset(sqlset_name=>'NEIL',SQLSET_OWNER=>'NEIL_DBA');
 dbms_output.put_line('Plans Loaded : '||to_char(cnt));
 dbms_spm.configure('spm_tracing',0);
end;
/

Looking in the trace file, it tells you what went wrong:

*** 2019-01-01 12:00:00.007
load sts: STS=NEIL, owner=NEIL_DBA
load sts: cursor opened
load sts: sql_id=4jcxvz3adqbs2 phv=1905606778
load sts: plan has empty outline, skipping it
load sts: plans total=0 plans loaded=0

So why does my plan have an empty outline?

The SQL_TEXT began “INSERT WHEN CNT = 1 THEN INTO SOME_TABLE… “.

It’s a multi-table insert, inserting into different tables depending upon a condition.
Baselines for Multi-table Inserts are NOT supported by SPM.

https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9014.htm#SQLRF01604

I hope this helps you with your baseline creation troubleshooting!

If you are having baseline creation problems, you should also check MOS article 789520.1.

 

Oracle SQL Monitor not monitoring my SQL

I needed to monitor a SQL statement in 11.2.0.3 (the limits mentioned below are the same in 12.1, 12.2, 18.4 and 19C) to determine what it was doing and why it was slow.

Usually I would use SQL Monitor [NOTE: You need to license the Oracle Tuning Pack to use SQL Monitor] for this, but the SQL was not appearing in there despite running for over 5 seconds and being a parallel SQL (either of which qualifies it for inclusion in SQL Monitor). So I asked Twitter why, and thought I'd share the output here.

https://twitter.com/ChandlerDBA/status/1075692070952677376

It was nailed immediately by Jonathan Lewis, with added help from Ivica Arsov. (thank you!)

There is a hidden parameter “_sqlmon_max_planlines” which states that any SQL with a plan in excess of 300 lines should not be monitored (see below for SQLMon hidden parameters – and change them at your own risk, preferably with the backing of an SR from Oracle Support). This execution plan had well over 300 lines. The solution is to change either the session or the system to allow monitoring to happen when the plan is over 300 lines.

e.g.

alter system  set "_sqlmon_max_planlines"=500 scope=memory sid='*';
or
alter session set "_sqlmon_max_planlines"=500;

The negative side effect is that the monitoring will use more resources (primarily memory and CPU), which is why there are default limits on this feature. You might want to change it back when you're finished to conserve resources.

Note that if you change the system parameter whilst the SQL is running, it will start to monitor the SQL at that point, so you will only get a partial picture of what is taking place, which is less valuable.

select ksppinm, ksppstvl, ksppdesc
  from sys.x$ksppi a, sys.x$ksppsv b
 where a.indx=b.indx
  and lower(ksppinm) like lower('%sqlmon%')
order by ksppinm;

KSPPINM                   KSPPSTVL  KSPPDESC
------------------------- --------- --------------------------------------------------------------------------------
_sqlmon_binds_xml_format  default   format of column binds_xml in [G]V$SQL_MONITOR
_sqlmon_max_plan          640       Maximum number of plans entry that can be monitored. Defaults to 20 per CPU
_sqlmon_max_planlines     300       Number of plan lines beyond which a plan cannot be monitored
_sqlmon_recycle_time      60        Minimum time (in s) to wait before a plan entry can be recycled
_sqlmon_threshold         5         CPU/IO time threshold before a statement is monitored. 0 is disabled

You may also notice a few other parameters in there. The "_sqlmon_recycle_time" shows the minimum amount of time that the SQLMon plan will be guaranteed to be retained. Any retention time after that will be a bonus and depend upon the amount of SQL needing to be monitored. I see monitoring plans disappearing after 2-3 minutes in some systems, so you need to be quick, and you should save the plans down to disk.

[Script: save_sqlmon]
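
For reference, the save step is roughly this (a sketch – substitute your own sql_id and file name):

set long 10000000 longchunksize 10000000 linesize 1000 pagesize 0 trimspool on
spool sqlmon_&&SQL_ID..html
select dbms_sqltune.report_sql_monitor(sql_id => '&&SQL_ID', type => 'ACTIVE', report_level => 'ALL')
  from dual;
spool off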

The mad thing is that I was aware of this restriction before I posted my request for help on Twitter but I'd completely forgotten about it. So here's the blog post to help me remember!

Stats Collection Time Anomaly

Jonathan Lewis (@JLOracle) recently published a short post about Stats Collection Time, talking about the view dba_optstat_operations (and dba_optstat_operation_tasks), which reminded me about (what I regard as) an anomaly in the output in the NOTES column in Oracle 12C.

I won't repeat why it's useful to check these views as Jonathan's note and @MDWidlake's comment on it should give you all you need to know.

The DBA_OPTSTAT_OPERATIONS.NOTES column contains the parameters passed into the DBMS_STATS command, so you know what was done. It also reports the DEFAULT used by the DBMS_STATS job. Well, it does if you call DBMS_STATS explicitly, but the standard overnight auto job just says "default". Why doesn't it expand on that the way the explicit call does? If the default was changed between runs, you may end up with very different results but with no indication why. Am I missing something?

The following 2 rows of data show the output from each run. Note that the DEFAULT for METHOD_OPT in this database has been changed from “FOR ALL COLUMNS SIZE AUTO” to “FOR ALL COLUMNS SIZE REPEAT”** but was not explicitly passed-in for either run.

DBMS_STATS.GATHER_SCHEMA_STATS – decodes the DEFAULTs

OPERATION : gather_schema_stats            
TARGET    : MYSCHEMA     
START_TIME: 15-SEP-16 07.04.47 
END_TIME  : 15-SEP-16 07.09.02 
STATUS    : COMPLETED                                   
JOB_NAME  : 
NOTES     : <params>
            <param name="block_sample" val="FALSE"/>
            <param name="cascade" val="NULL"/>
            <param name="concurrent" val="FALSE"/>
            <param name="degree" val="NULL"/>
            <param name="estimate_percent" val="DBMS_STATS.AUTO_SAMPLE_SIZE"/>
            <param name="force" val="FALSE"/>
            <param name="gather_fixed" val="FALSE"/>
            <param name="gather_temp" val="FALSE"/>
            <param name="granularity" val="AUTO"/>
            <param name="method_opt" val="FOR ALL COLUMNS SIZE REPEAT"/>
            <param name="no_invalidate" val="NULL"/>
            <param name="options" val="GATHER"/>
            <param name="ownname" val="MYSCHEMA"/>
            <param name="reporting_mode" val="FALSE"/>
            <param name="statid" val=""/>
            <param name="statown" val=""/>
            <param name="stattab" val=""/>
            <param name="stattype" val="DATA"/>
            </params>

Autotask Overnight Gather – doesn’t decode the DEFAULTs

OPERATION : gather_database_stats (auto)   
TARGET    : AUTO       
START_TIME: 15-SEP-16 22.01.20 
END_TIME  : 15-SEP-16 22.38.40 
STATUS    : COMPLETED            
JOB_NAME  : ORA$AT_OS_OPT_SY_1212  
NOTES     : <params>
            <param name="block_sample" val="FALSE"/>
            <param name="cascade" val="NULL"/>
            <param name="concurrent" val="FALSE"/>
            <param name="degree" val="DEFAULT_DEGREE_VALUE"/>
            <param name="estimate_percent" val="DEFAULT_ESTIMATE_PERCENT"/>
            <param name="granularity" val="DEFAULT_GRANULARITY"/>
            <param name="method_opt" val="DEFAULT_METHOD_OPT"/>
            <param name="no_invalidate" val="DBMS_STATS.AUTO_INVALIDATE"/>
            <param name="reporting_mode" val="FALSE"/>
            <param name="stattype" val="DATA"/>
            </params>

 

**You should explicitly control the histograms that you need, maintaining your schema stats as you would like Oracle to see them. The Oracle default approach of lots of histograms can be costly to maintain and store, and any histograms gathered using sampling (all prior to 12C, and Height-Balanced / Hybrid in 12C onwards) carry a risk, especially in OLTP systems.
FOR ALL COLUMNS SIZE REPEAT was useful until Oracle 12, when a change in the algorithm made it dangerous, and it should NOT be used. You should be using DBMS_STATS.SET_GLOBAL_PREFS and DBMS_STATS.SET_TABLE_PREFS to control your stats.
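
As a minimal sketch of that approach (the schema, table and column names here are made up): set a sensible global default, then ask explicitly for the histograms you actually want on the tables that need them:

begin
  -- global default: no histograms unless a table-level pref says otherwise
  dbms_stats.set_global_prefs('METHOD_OPT','FOR ALL COLUMNS SIZE 1');
  -- this table genuinely needs a histogram on STATUS, so ask for it explicitly
  dbms_stats.set_table_prefs('MYSCHEMA','MY_SKEWED_TABLE','METHOD_OPT',
                             'FOR ALL COLUMNS SIZE 1 FOR COLUMNS STATUS SIZE 254');
end;
/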

 

Accessing STATUS columns efficiently

A frequently recurring design problem with relational databases is locating unprocessed rows in a large table, so we know which rows of data are still to be processed.

The problem with a STATUS column is that it generally has low cardinality; there are probably only a handful of distinct values [(C)omplete, (E)rror, (U)nprocessed or something like that]. Most records will be (C)omplete. This makes STATUS a poor candidate for standard B-Tree indexation. In a high throughput OLTP database, using bitmap indexes is probably not an option due to concurrency.

[Aside: When coding flag columns in Oracle, ALWAYS use a VARCHAR2(1 CHAR) {or CHAR(1 CHAR) if you prefer, but a CHAR is a VARCHAR2 under the covers and occupies the same number of bytes}. This is in preference to a NUMBER(1), which occupies more bytes for a "1" than a "0", so when you update it you run the risk of row migration, chained rows and a performance hit. Frequently, ORMs like Hibernate code for NUMBER by default. Override this!]

So what are my options? There’s a short list of possible table accesses for a low cardinality column.

1. Table scan. In an OLTP database where you only want a tiny fraction of the rows in the table, this would be a bad choice.
2. Index the accessed columns and accept the inevitable INDEX FULL SCAN or INDEX FAST FULL SCAN. This is not great and you probably need a Histogram on the column to convince the optimizer to use the index for your low frequency values. Otherwise you may be back to the table scan.
3. Make the "Complete" status "NULL".
4. Use a function-based index which makes the Complete status appear to be NULL for a specific query.

So what’s with options 3 and 4, why are they good, and how do we use them?

Unlike some RDBMSs, Oracle does not store NULL values in its simple (non-composite) B-tree indexes. Therefore, if you choose Option (3) and make your "Complete" status be represented by a NULL, you will maintain an index on STATUS in which the only values that are stored are values you are interested in. This makes the index very sexy to the optimizer as it will generally be very tiny. However, we face one small problem. Convincing Developers that a NULL is a valid status can be difficult. A NULL is a non-representative value. It is not supposed to represent anything. It means "I don't know". It doesn't behave the same as normal values. This tends to freak out Developers and designers sometimes.
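
A minimal sketch of Option 3 (table and column names are mine, not from the worked example below): make NULL mean Complete, and the index quietly ignores all of the completed rows:

create table work_queue (
  id      number        not null,
  status  varchar2(1 char),           -- NULL = complete, 'U' = unprocessed, 'E' = error
  payload varchar2(100 char));

create index work_queue_status_ix on work_queue (status);

-- completed rows vanish from the index...
update work_queue set status = null where id = 42;

-- ...so this is a small index range scan over only the interesting rows
select id from work_queue where status = 'U';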

That's where Option 4 comes in. If we wrap the index definition in a CASE statement, to produce a function-based index, we have a highly specific, tailored index on our table. If the SQL predicate matches the index expression exactly, we get a serious performance payoff.

But don’t take my word for it. Here’s a worked example from my laptop:

 
Here's the table and its data distribution (16m rows, and a handful we care about):

NEIL @ ORCL01 > desc test_table
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 ID                            NOT NULL NUMBER
 STATUS                        NOT NULL VARCHAR2(1 CHAR)
 DESCRIPTION                   NOT NULL VARCHAR2(100 CHAR)

NEIL @ ORCL01 > select status,count(*) from test_table group by status

S   COUNT(*)
- ----------
E         16
C   16777216
Y         32

 
Here are the indexes on the table, and their sizes. As you can see, the function-based index is absolutely tiny, making it as attractive to storage admins as it is to the optimizer.

- alter table test_table add constraint test_table_pk primary key (id);
- create index test_table_CASE on test_table (case status when 'Y' then status else null end);
- create index test_table_COVER_COMP on test_table (status, id) compress 1;
- create index test_table_STATUS on test_table (status) compress 1;



NEIL @ ORCL01 > select segment_name,segment_type,sum(bytes/1024) kb from user_extents 
where segment_name like 'TEST_TABLE%' 
group by segment_type,segment_name order by 2 desc,1;

SEGMENT_NAME               SEGMENT_TYPE               KB
-------------------------- ------------------ ----------
TEST_TABLE                 TABLE                  555008
TEST_TABLE_CASE            INDEX                      64
TEST_TABLE_COVER_COMP      INDEX                  658432
TEST_TABLE_PK              INDEX                  319488
TEST_TABLE_STATUS          INDEX                  413696

Some Index stats:
INDEX_NAME                DISTINCT_KEYS AVG_LEAF_BLOCKS_PER_KEY AVG_DATA_BLOCKS_PER_KEY CLUSTERING_FACTOR STATUS     NUM_ROWS SAMPLE_SIZE LAST_ANAL
------------------------- ------------- ----------------------- ----------------------- ----------------- -------- ---------- ----------- ---------
TEST_TABLE_CASE                       1                       1                       6                 6 VALID            32          32 21-FEB-16
TEST_TABLE_COVER_COMP          16748149                       1                       1            125447 VALID      16748149      234974 21-FEB-16
TEST_TABLE_PK                  17003239                       1                       1             91391 VALID      17003239      492287 21-FEB-16
TEST_TABLE_STATUS                     3                   13828                   32011             96034 VALID      16257590      363295 21-FEB-16

 
Where we have a choice of useful indexes, we get a FAST FULL SCAN with a hefty cost. A histogram could have given us an index RANGE SCAN, which can be very good.
With no Histogram:

select id from test_table where status = 'Y';

Plan hash value: 1140618830

----------------------------------------------------------------------------------------------
| Id  | Operation            | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                       |       |       | 18753 (100)|          |
|*  1 |  INDEX FAST FULL SCAN| TEST_TABLE_COVER_COMP |  5592K|    42M| 18753   (1)| 00:00:01 |
----------------------------------------------------------------------------------------------

 
With a histogram in place on STATUS, you get a much better plan as the covering index avoids the need for the table look-up. You also get the risk that the optimizer may have bind variable peeking issues and other complications should we have lots of table joins.

select id from test_table where status = 'Y'

Plan hash value: 2912582684

------------------------------------------------------------------------------------------
| Id  | Operation        | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                       |       |       |     3 (100)|          |
|*  1 |  INDEX RANGE SCAN| TEST_TABLE_COVER_COMP |    32 |   256 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

NOTE: Ditching the covering index and just using the index on STATUS is pretty efficient too when combined with a histogram:

select id from test_table where status = 'Y'

Plan hash value: 2416598805

---------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                   |       |       |     4 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TEST_TABLE        |    32 |   256 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | TEST_TABLE_STATUS |    32 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------


 
And now with the function-based index; the CASE statement removes all statuses we are not interested in, giving a tiny, tidy index.

NOTE: The Predicate in the query must EXACTLY match the function-based index for it to be used.

select id from test_table where case status when 'Y' then status else null end = 'Y'

Plan hash value: 2073004851

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                 |       |       |     7 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TEST_TABLE      |    32 |   256 |     7   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | TEST_TABLE_CASE |    32 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------

Conclusion: For a highly skewed STATUS column you need a histogram, which is something you should mostly avoid in OLTP systems using BIND variables. Having a highly focussed function-based index allows for a tiny self-maintaining index which is guaranteed to only be used for queries that you want it to be used for.
 

NOTE: The original idea behind using NULLS to minimise index size came from the performance expert, Jonathan Lewis. I have implemented both NULL-as-complete design and case-based indexes at several clients, in varying forms, and always to great success.

Pre-creating Interval Partitions

One of the major problems with interval-based partitioning is the statistics. Partitions appear dynamically as-needed, based upon data being inserted or updated, and any partition magically appearing isn't going to have any statistics.

This is generally a stability issue as you will, at best, be using dynamic statistics for your optimizations. So how do we deal with it? My preferred method is to pre-create the partitions and copy statistics from a good partition into the pre-created partition. But how do we get the partitions to appear? Here are 2 options:

  1. Insert a row with the partition key for the partition we wish to create, and roll back. This can be tricky, especially with tables containing many NOT NULL columns, and is subject to failure based upon table evolution.
  2. Lock the relevant partition in share mode using the command LOCK TABLE <table> PARTITION FOR (<partition-key>) IN SHARE MODE;

    This will place a shared lock on the non-existent partition, which Oracle will create so it can lock it. A much less messy solution, and not one I had thought of until shown the light by Dan Jankowski.

So does option 2 work? Here’s a quick example:

10:51:55 NEIL @ ORCL01 > CREATE TABLE interval_table (id NUMBER, created_date DATE)
10:51:55   2             PARTITION BY RANGE (created_date) INTERVAL (NUMTOYMINTERVAL(1,'MONTH'))
10:51:55   3           ( PARTITION part_01 values LESS THAN (TO_DATE('01-JAN-2015','DD-MON-YYYY')))
10:51:55   4  /
Table created.

10:51:55 NEIL @ ORCL01 > select table_name, partition_name,high_value from user_tab_partitions order by 1,2;
TABLE_NAME                     PARTITION_NAME                 HIGH_VALUE
------------------------------ ------------------------------ --------------------------------------------------------------------------------
INTERVAL_TABLE                 PART_01                        TO_DATE(' 2015-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA


use a shared lock to generate a new partition

10:51:55 NEIL @ ORCL01 > lock table interval_table partition for (to_date('01-JAN-2016','DD-MON-YYYY')) in share mode;
Table(s) Locked.

10:51:55 NEIL @ ORCL01 > select table_name, partition_name,high_value from user_tab_partitions order by 1,2;
TABLE_NAME                     PARTITION_NAME                 HIGH_VALUE
------------------------------ ------------------------------ --------------------------------------------------------------------------------
INTERVAL_TABLE                 PART_01                        TO_DATE(' 2015-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
INTERVAL_TABLE                 SYS_P647                       TO_DATE(' 2016-02-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

10:51:55 NEIL @ ORCL01 > lock table interval_table partition for (to_date('01-FEB-2016','DD-MON-YYYY')) in share mode;
Table(s) Locked.
10:51:55 NEIL @ ORCL01 > lock table interval_table partition for (to_date('01-MAR-2016','DD-MON-YYYY')) in share mode;
Table(s) Locked.
10:51:55 NEIL @ ORCL01 > lock table interval_table partition for (to_date('01-APR-2016','DD-MON-YYYY')) in share mode;
Table(s) Locked.
10:51:55 NEIL @ ORCL01 > select table_name, partition_name,high_value from user_tab_partitions order by 1,2;
TABLE_NAME                     PARTITION_NAME                 HIGH_VALUE
------------------------------ ------------------------------ --------------------------------------------------------------------------------
INTERVAL_TABLE                 PART_01                        TO_DATE(' 2015-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
INTERVAL_TABLE                 SYS_P647                       TO_DATE(' 2016-02-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
INTERVAL_TABLE                 SYS_P648                       TO_DATE(' 2016-03-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
INTERVAL_TABLE                 SYS_P649                       TO_DATE(' 2016-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
INTERVAL_TABLE                 SYS_P650                       TO_DATE(' 2016-05-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

10:51:55 NEIL @ ORCL01 > -- and release the locks... the rollback doesn't rollback the creation of the new partitions.
10:51:55 NEIL @ ORCL01 > rollback;
Rollback complete.

So now what? To get the stats right I follow this rule set:

Firstly, lock the table stats when you create the table and have a dedicated, focused stats job. This stops the general stats job from getting in the way of this important partitioned table.

  • Create a new partition
  • Copy stats using DBMS_STATS.COPY_TABLE_STATS from your fullest partition, with FORCE=TRUE to override the locked stats option (see the sketch after this list). Always try to pretend you have more data than you really have if faking it – it's safer, as an optimized plan with larger expected data sets processing a smaller data set tends to work much better than the stats assuming a small data set and being surprised by lots of data. Consider using SCALE_FACTOR if you have a growing dataset. Don't rely upon Optimizer magic, such as Adaptive Execution Plans, to dig you out of a hole.
  • Once a partition becomes “old” (like a no-longer used date-range partition), gather actual stats to get all your stats for that partition accurate. That may even become your new baseline stats copy partition. You will possibly never need to gather stats again for that partition.
  • Be careful if you are copying stats when you have an object-level change. For example, if you put a new index on, you need to re-copy stats post change to any pre-created partitions. Empty pre-created partitions will have accurate (i.e. empty) stats and that’s really not what you want.
  • Repeat each time you pre-create a new partition
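
As a sketch of the copy step (using the partition names from the example above – adjust owner, source partition and scale factor to suit):

begin
  dbms_stats.copy_table_stats(
    ownname      => 'NEIL',
    tabname      => 'INTERVAL_TABLE',
    srcpartname  => 'PART_01',      -- a full, representative partition
    dstpartname  => 'SYS_P650',     -- the empty, pre-created partition
    scale_factor => 1,
    force        => TRUE);          -- override the locked stats
end;
/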

Auditing Read-Only Standbys

If your company has a passing interest in database security, you are probably running some sort of session auditing at the very least [audit session] (although this can also be useful for troubleshooting connectivity issues). There’s a reasonable chance you’re also running some level of object auditing, or even DML access auditing if your employer so dictates:

Check access/change of objects in the DB
  1  select audit_option, success, failure
  2  from dba_stmt_audit_opts
  3  union
  4  select privilege, success, failure
  5  from dba_priv_audit_opts
  6* order by 2,1
AUDIT_OPTION                                       SUCCESS                                  FAILURE
-------------------------------------------------- ---------------------------------------- -----------------
ALTER ANY PROCEDURE                                BY ACCESS                                BY ACCESS
ALTER ANY TABLE                                    BY ACCESS                                BY ACCESS
ALTER DATABASE                                     BY ACCESS                                BY ACCESS
ALTER PROFILE                                      BY ACCESS                                BY ACCESS
ALTER SYSTEM                                       BY ACCESS                                BY ACCESS
ALTER TABLE                                        BY ACCESS                                BY ACCESS
ALTER USER                                         BY ACCESS                                BY ACCESS
AUDIT SYSTEM                                       BY ACCESS                                BY ACCESS
CREATE ANY JOB                                     BY ACCESS                                BY ACCESS
CREATE ANY LIBRARY                                 BY ACCESS                                BY ACCESS
CREATE ANY PROCEDURE                               BY ACCESS                                BY ACCESS
CREATE ANY TABLE                                   BY ACCESS                                BY ACCESS
CREATE EXTERNAL JOB                                BY ACCESS                                BY ACCESS
CREATE LIBRARY                                     BY ACCESS                                BY ACCESS
CREATE PROCEDURE                                   BY ACCESS                                BY ACCESS
CREATE PUBLIC DATABASE LINK                        BY ACCESS                                BY ACCESS
CREATE SESSION                                     BY ACCESS                                BY ACCESS
CREATE TABLE                                       BY ACCESS                                BY ACCESS
CREATE USER                                        BY ACCESS                                BY ACCESS
DATABASE LINK                                      BY ACCESS                                BY ACCESS
DIRECTORY                                          BY ACCESS                                BY ACCESS
DROP ANY PROCEDURE                                 BY ACCESS                                BY ACCESS
DROP ANY TABLE                                     BY ACCESS                                BY ACCESS
DROP PROFILE                                       BY ACCESS                                BY ACCESS
DROP USER                                          BY ACCESS                                BY ACCESS
EXEMPT ACCESS POLICY                               BY ACCESS                                BY ACCESS
GRANT ANY OBJECT PRIVILEGE                         BY ACCESS                                BY ACCESS
GRANT ANY PRIVILEGE                                BY ACCESS                                BY ACCESS
GRANT ANY ROLE                                     BY ACCESS                                BY ACCESS
PROFILE                                            BY ACCESS                                BY ACCESS
PUBLIC DATABASE LINK                               BY ACCESS                                BY ACCESS
PUBLIC SYNONYM                                     BY ACCESS                                BY ACCESS
ROLE                                               BY ACCESS                                BY ACCESS
SYSTEM AUDIT                                       BY ACCESS                                BY ACCESS
SYSTEM GRANT                                       BY ACCESS                                BY ACCESS
USER                                               BY ACCESS                                BY ACCESS
CREATE JOB                                         BY SESSION                               BY SESSION

And you’re probably writing it into a database table [AUDIT_TRAIL=’DB’]

So how does that work if you open a Dataguard database read only? You are writing into sys.aud$ on the Primary, and that table is replicated to the Standby. So what happens?

From Oracle 11G, if you are running Dataguard and opening the standby up for read access, you may not notice the line in the ALERT log which reads:

AUDIT_TRAIL initialization parameter is changed to OS, as DB is NOT compatible for database opened with read-only access

So it starts writing down O/S audit trail files for all of your auditing options (well, the session connect and DML audit options – you can’t run DDL in a r/o DB). You might want to go and see just how many files it has written to [audit_file_dest], as you may be surprised at just how many are in there.

You may, one day, either run out of space or (more worryingly) have so many millions of files that it causes a performance problem when Oracle accesses the O/S directory. You might want to think about some sort of periodic clean-up job.
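
If you can schedule it somewhere appropriate, DBMS_AUDIT_MGMT can do the purging for you. Here's a minimal sketch (the 90-day retention is an assumption – agree a figure with your security team, and note that for files piling up on a read-only standby host you may have to fall back to an O/S-level clean-up there instead):

begin
  if not dbms_audit_mgmt.is_cleanup_initialized(dbms_audit_mgmt.audit_trail_os) then
    dbms_audit_mgmt.init_cleanup(
      audit_trail_type         => dbms_audit_mgmt.audit_trail_os,
      default_cleanup_interval => 24);
  end if;
  -- only remove audit files older than 90 days
  dbms_audit_mgmt.set_last_archive_timestamp(
    audit_trail_type  => dbms_audit_mgmt.audit_trail_os,
    last_archive_time => systimestamp - interval '90' day);
  dbms_audit_mgmt.clean_audit_trail(
    audit_trail_type        => dbms_audit_mgmt.audit_trail_os,
    use_last_arch_timestamp => TRUE);
end;
/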

SQL Plan Management – 12C dumb feature

In Oracle 11G, Oracle introduced SQL Plan Management (SPM). It is excellent (I love it to bits). It allows you to create Baselines against SQL which lock-down the SQL execution plan. No more plan flips. More consistency. Perfect**.

Whenever some Baselined SQL is run, Oracle still parses it and compares the parsed output to the accepted (Evolved) baselines. If the newly parsed plan is better, a new baseline is added to DBA_SQL_PLAN_BASELINES but is NOT accepted. This means that you need to spend time manually accepting the baseline; running DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE and checking the new plan.

If you want it, and/or Oracle evaluates that it is a better plan for that particular set of bind variables, the plan is accepted and becomes a candidate to be used by future executions of your SQL. Complete control over your execution plans.

So, Oracle, what’s all this about in Oracle 12C, eh?

In Oracle 12C there's a new SPM Evolve advisor task. "By default, SYS_AUTO_SPM_EVOLVE_TASK runs daily in the scheduled maintenance window" – so it runs every night and by default it runs DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE for all new baselines created that day and automatically accepts the new plans.

BY DEFAULT? NO! NO! NO!

That is precisely what I don't want from baselines – Oracle making its own mind up about plans without any input from me. I'm using baselines to stop Oracle changing its mind. To explicitly limit the number of paths allowed by the Optimizer to ones I know about and with which I am comfortable. Don't introduce functionality to do the opposite.

So, immediately following the installation of 12C, I would recommend running (you need to be SYS for this):

SELECT PARAMETER_NAME, PARAMETER_VALUE AS "VALUE"
FROM   DBA_ADVISOR_PARAMETERS
WHERE  TASK_NAME = 'SYS_AUTO_SPM_EVOLVE_TASK' AND
         PARAMETER_NAME in ('ACCEPT_PLANS','TIME_LIMIT')
ORDER BY 1;

PARAMETER_NAME            VALUE
------------------------- ----------
ACCEPT_PLANS              TRUE
TIME_LIMIT                3600

Then run:

BEGIN
  DBMS_SPM.SET_EVOLVE_TASK_PARAMETER('SYS_AUTO_SPM_EVOLVE_TASK',
    'ACCEPT_PLANS', 'false');
END;
/

OK, back where we were, with any baselines fixed in place and doing what I want them to do! Not change.

 

**Perfect? No. But Baselines are good and, as long as your DB structure does not change, they should keep working. If they don’t, raise an SR with Oracle as it’s probably a bug.

UPDATE 2015-11-25: This is still as true in 12.1.0.2.5 as it was in 12.0 Grrrr!!!

SAN Migration: When modern SANs “Fail”

… or at least, when they don’t behave like you were expecting them to.

I recently performed a major migration of a dozen missions critical Oracle and SQL Server systems from a pair of old EMC CX700’s to 2 brand new shiny HP XP 20000’s. This blog post is intended to show, at a very high level, what steps were taken to ensure the new SAN could cope with the volume of traffic. It also shows that, despite the new SAN being much more modern and intelligent, how you can still get caught out by unexpected behaviour.

The databases to be migrated varied from high throughput OLTP, low throughput – fast response time OLTP, a small Data Warehouse and Grid Control. The systems are 24 x 7 x 365, but a nightly maintenance window can be arranged on some days of the week between midnight and 3am. Not a large window, but invaluable.

What are we migrating?

For a successful SAN migration, we first need to understand what we are migrating. To do this we need to profile the I/O requirement of each Oracle database. Oracle provides us with all of the tools we need to be able to do this. If you have the Tuning and Diagnostics Pack licensed (which all Oracle sites should have – it should really be part of the base product) then you can pull the information you need directly out of the AWR table DBA_HIST_SYSTEM_EVENT. By default your AWR stats gather every hour and only last a week. Hourly averages of system event wait times will give a good starting point, but I would not want to use them to determine peak volumes. If you capacity plan to your averages, you will run out of resources before you hit your peaks. A lot can happen in an hour when you are looking at millisecond response times. You need to be much more granular than an hourly average, but you might not want to change AWR to gather all of its information as frequently as you need for the I/O events. You need to develop some code to gather the information from v$system_event directly, so I did [I will post this code up shortly]. You can be as granular as you want, but I balanced peaks against volume of information and the impact of gathering it and was running every 5 minutes. Mostly. I also took some much more frequent samples to ensure that I wasn't wildly wrong with the 5 minute averages, and that I wasn't experiencing very short peaky loads. This method also has the advantage of not needing any additional pack licensing from Oracle. Sorry Larry.
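
That code isn't reproduced here, but the idea is easy enough to sketch (this is an illustration of the approach, not the script I actually used): snapshot v$system_event into a history table on a short cycle, then diff consecutive samples afterwards.

create table io_event_hist as
select systimestamp sample_time, event, total_waits, time_waited_micro
from   v$system_event
where  1 = 0;

-- run this every few minutes, e.g. from a DBMS_SCHEDULER job
insert into io_event_hist
select systimestamp, event, total_waits, time_waited_micro
from   v$system_event
where  event in ('db file sequential read','db file scattered read',
                 'direct path read','direct path write temp',
                 'log file parallel write','log file sync');
commit;

The average wait over each interval is then just the delta in time_waited_micro divided by the delta in total_waits.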

What are the resources available on the new SAN?

The second requirement for a SAN migration is to understand the capabilities of your new SAN. There were already some systems on the new SAN so I was unable to test to destruction. However, some dedicated resources were available on the shared SAN, namely dedicated disks, cache and front-end ports. I needed to throw substantial workloads at the SAN to stress the resources and find the tipping point. There are many workload tools available, such as Hammerora and RAT. Some are expensive, such as RAT. However, for my purposes I was only looking to stress the SAN and get a comfortable feeling about the IOPS and throughput performance. This was achieved using Oracle's free SAN stress tool, Orion [I will hopefully post more details about this shortly]. By taking all of the profiled I/O rates, adding them up and comparing against the new SAN and its much greater cache capacity (16GB dedicated to my systems, up from 4GB on the EMC's), we were able to see that the footprint of the databases to be migrated fitted comfortably into the capabilities of the new SAN. And so the migrations began.

Thin Provisioning

To provide resilience for the databases, LUNs were presented from each SAN to each cluster node, and were mirrored at the Operating System level, either using native mirroring on the Unix platforms, or Veritas on the Windows platforms. This allows either SAN (or SAN location) to fail with no loss of service. The LUNs were either RAID-10 or RAID-5, and were carved out of the storage pools using Thin Provisioning.

Thin Provisioning (or using Meta-Luns or similar techniques from your SAN provider) is a way to allow LUNs to be easily carved out of storage pools and extended at a later date. They are wonderful for SAN administrators and people who have not capacity planned their systems due to their dynamic nature and minimization of space wastage. However, they work on the basis of providing a bit of storage to your LUN from every disk in the array. This gives a maximum concurrency, maximum contention model. Every LUN is on the same disks, competing for the same resources. Lots of IOPS, but everyone wants them. The Thin Provisioning also adds a small overhead to the I/O processing. Not much. Not even a whole millisecond.

Migration Problem

The migrations progressed well, from least critical to most critical system. As each system was migrated, we kept monitoring and measuring the I/O response times. Migrating the Data Warehouse showed that the new SAN was performing as expected. Processing times were lower. Not much, but in line with expectation and allocated resources. However, Grid Control was exhibiting some strange response times. Despite theoretically having more resources, some I/O response time metrics were worse on average, and much more variable than before. With the EMC SAN’s, we  had experienced a good, consistent level of performance for each LUN. The XP’s were proving to be more erratic.

I did a lot of investigation, moving resources around (e.g. dedicated front-end ports for the Grid Control LUNs) and measuring the effect. The level of I/O associated with Grid Control, with only a couple of hundred servers under management, was substantially lower than that of systems competing for shared resources (e.g. the Data Warehouse, which was exponentially more demanding). It seemed that Grid Control's LUNs were not able to survive in the SAN cache. The old EMC cx700's were relatively "dumb" compared to the new XP20000's, and had effectively dedicated an amount of cache to each LUN. The more intelligent XP's would not be so potentially wasteful of resources, and so deemed the (relatively) low level of activity from Grid Control's LUNs to be unimportant and aged the blocks out.

We could live with this for Grid Control. Blisteringly fast response time is not a critical factor in the success of Grid Control. The same could not be said for one of our mission critical applications. It requires a very low end-to-end response time for each transaction. We carefully measure each step – client, calculation, render, network latency, database response time, etc. It needs to be fast and consistent. From the database, it really needs to hit cache or SSD’s. Spinning rust (ordinary disk) is simply too slow. However, the level of activity for this system, in terms of MB/s and IOPS, is relatively low compared to all of the other systems competing for SAN resources. About 3% of the whole. Add to this the overhead of Thin Provisioning and we have a problem. When a 2ms average turns into a nearly 4ms average with much higher peaks, as it did for db_file_sequential_read on Grid Control, that would be a major problem for this application.

Solution

Talking to HP, they would only guarantee the response time we demand by using SSDs, at a cost that made my eyes spin and think of buying a Supercar instead. A more practical (i.e. cheaper) solution had to be found, and it was (with the help and support of HP, just no guarantees 🙂 ). We bought some more conventional disk and configured it into a traditional Raid-10 storage pool. No Thin Provisioning. We then partitioned the 16GB SAN cache into a 4GB cache dedicated to this system, and left the 12GB cache behind for everything else. We were migrating from 4GB, and understood that we had plenty of wriggle room in this configuration.

And the result? After migration, the performance of the mission critical low throughput system improved substantially, with consistent log_file_parallel_write times of less than 2ms and db_file_sequential_read times of less than 1ms, better than we were experiencing on the old reliable EMC’s. I mean, less than 1ms? That’s fast.

I/O Response Times – Graph