Parallel Madness

usain-bolt-1I’ve noticed at a few clients with data warehouses recently that the Developers and, upon occasion, Business Users have a real fondness for hinting the SQL they are producing with one particular hint. PARALLEL.

As any fule kno, this IS the magic go-faster hint. PARALLEL(2) is obviously twice as fast as serial execution. PARALLEL(4) is amazing and PARALLEL(64) like Usain Bolt on Red Bull.

The problem is that, like all database features, parallel query comes with a cost.

When you specify /*+ PARALLEL(n) */ you are telling the optimizer that it has a lot more resources to use to complete this particular query. Not a single thread but many. PARALLEL(10) will use 21 processes to complete its execution – 20 Parallel Execution Server (10 producers, 10 consumers) and a coordinator (which is your connections shadow process) which will deal with any aspects of the parallel plan which cannot be parallelised.

Allowed free reign to use PARALLEL, devs and users will quickly consume all of the resources available on a given server, causing contentions which will inevitably slow down the overall execution of every piece of code executing on there. To illustrate this, I’d like to use an example I came across a while ago to show how excess PARALLEL of a single statement can be problematic itself.

Lets say I have a single server with 16 cores, lots of memory and a decent SSD array so the problem will centre around the CPU. Inevitably your 16 cores will be hyperthreaded. This then looks to Oracle like you have 32 cores. Whilst Oracle knows you have 16 hyperthreaded cores, you get CPU_COUNT=32

NOTE: 16 cores hyperthreaded DO NOT have the power of 32 cores, especially when dealing with databases. Some database workloads are actually WORSE with hyperthreading enabled (e.g. Data Warehouse systems on SQL Server). Inevitably the server admins will have enabled it unless you can provide cast-iron evidence to have it disabled.

I have a statement which the users are complaining about. It starts with the  following code: SELECT /*+ PARALLEL */ (complex multi-table join)

So what does this unrestricted (and therefore DEFAULT!) degree of parallelism (DOP) do in this case?
The default DOP is defined as PARALLEL_THREADS_PER_CPU x CPU_COUNT=2 x 32 = PARALLEL(64)
On RAC it is PARALLEL_THREADS_PER_CPU x CPU_COUNT x INSTANCE_COUNT!

Lets have a look at the ACTIVITY of this PARALLEL(64) query:

parallel_64_activity

You can see from the screenshot that Oracle knows there are 16 cores but it has gone PARALLEL(64), using 128 parallel exection slaves and fully expecting to have the available resources to run PARALLEL(64) efficiently. The execution plan is calculated around this assumption. There are 64 parallel execution slaves attempting to work on this at the same time. It’s worth looking at the metrics associated with this query.

Peaks of 2GB/s disk, 140GB of TEMP and 32 CPU’s.

parallel_64_metrics

The query took 36.9 minutes to complete.

I had the query changed to inject a modicum of realism into the available resources at the time of the run, and restricted the DOP to PARALLEL(8).

parallel_8_activity

Oracle is restricted to the limited amount of resource, which is availble. The execution plan is different, to reflect the lower amount of available resources. Looking at the metrics:

Peaks of 1GB/s, 3GB of TEMP and 12 CPU’s.

parallel_8_metrics

The query took 10.3 minutes to complete. 3 times quicker!

It is worth noting that testing the query in isolation with PARALLEL(16) took 7 minutes to complete, but that DOP would have resource-starved the server as a whole causing everything else currently executing to slow down, and was discounted as an option.

With PARALLEL, less can be better.
Using PARALLEL for everything can be counter-productive.
Co-ordinating PARALLEL across multiple RAC nodes can be disasterous without careful design to take advantage of this feature (using PARALLEL_LOCAL_FORCE=TRUE will restrict parallel processing to a single node). Oracle recommend you don’t do this. Other opinions are available and I generally recommend setting this to TRUE.

We have a limited amount of resources on our servers. Working within those resource limitations will provide substantial benefits.

Before using Parallel processing, I’d recommend thoroughly reading the VLDB and Partitioning Guide appropriate to your database release!

If you use PARALLEL:

  • Use Resource Manager to restrict DOP appropriately for different groups of users.
  • Consider setting PARALLEL_LOCAL_FORCE=TRUE
  • Consider setting PARALLEL_THREAD_PER_CPU=1, especially where you have hyperthreading enabled.
  • Consider your overall workload, not just the SQL you are working with right now.

 

Where’s my Oracle SQL Plan Baseline?

saeed-mhmdi-130222-unsplashNo so long ago I was having fun creating SQL Plan Baselines in a old 11.2.0.3 database due to be decomissioned but which needs to keep running for a while (no doubt several years) – so minimal time/money to be expended on it. Then, one day, I couldn’t create a baseline and needed to figure out why…

We get the occasional painful plan change, so pinning down an acceptable historic plan using a Baseline is becoming a regular occurrence. The standard route for this is:

1. Notice plan isn’t good and that we have been running with a good plan historically
2. Identify an plan hash value which is acceptable
3. Load it from the library cache if that plan is in there (unlikely)

[ Kerry Osborne has a useful blog post/script for this so I won’t reproduce here: http://kerryosborne.oracle-guy.com/2012/03/displaying-sql-baseline-plans/ ]

4. If not, create a SQL Tuning Set from AWR with that specific plan and convert the SQL Tuning Set into a Baseline. You will need the begin-and-end snap id’s containing the plan, the sql_id and the plan_hash_value for the specific plan you want to baseline:

Please ensure you have suitable licensing before running any of the following code!

declare
 baseline_ref_cur DBMS_SQLTUNE.SQLSET_CURSOR;
begin
 DBMS_SQLTUNE.CREATE_SQLSET('NEIL');
 open baseline_ref_cur for
 select VALUE(p) from table(
 DBMS_SQLTUNE.SELECT_WORKLOAD_REPOSITORY(&begin_snap_id, &end_snap_id,'sql_id='||CHR(39)||'4jcxvz3adqbs2'||CHR(39)||'and plan_hash_value=1905606778',NULL,NULL,NULL,NULL,NULL,NULL,'ALL')) p;
 DBMS_SQLTUNE.LOAD_SQLSET('NEIL', baseline_ref_cur);
end;
/

Did we get it?

select * from dba_sqlset_statements where sqlset_name = 'NEIL';

SQLSET_NAME, SQLSET_OWNER, SQLSET_ID, SQL_ID,       FORCE_MATCHING_SIGNATURE, SQL_TEXT,                                   PARSING_SCHEMA_NAME, PLAN_HASH_VALUE ...
----------------------------------------------------------------------------------------------------------------------------------------------------------
NEIL         NEIL_DBA             35  4jcxvz3adqbs2 6134983393191283611       INSERT WHEN CNT = 1 THEN INTO SOME_TABLE...  APP                  1905606778      ...

Cool! I have captured the SQL. Lets create the baseline using: dbms_spm.load_plans_from_sqlset

declare
cnt number;
begin
 cnt := dbms_spm.load_plans_from_sqlset(sqlset_name=>'NEIL',SQLSET_OWNER=>'NEIL_DBA');
 dbms_output.put_line('Plans Loaded : '||to_char(cnt));
end;
/
... and I get ...

Plans Loaded : 0

Zero plans? So what plans are in there? Has my plan actually appeared?

select SQL_HANDLE,PLAN_NAME,CREATOR,ORIGIN,PARSING_SCHEMA_NAME, ENABLED,ACCEPTED,REPRODUCED,CREATED
  from dba_sql_plan_baselines where origin like 'MANUAL-LOAD%' order by created desc;

SQL_HANDLE           PLAN_NAME                      CREATOR   ORIGIN      PARSING_SCHEMA_NAME  ENABLED  ACCEPTED  REPRODUCED  CREATED
-----------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_23829f7bf3712438 SQL_PLAN_270nzggtr291s1a6ce30c NEIL_DBA  MANUAL-LOAD APP                  YES      YES       YES         2018-12-31 00.00.00.000000000

No! But how do I prove that? Baselines don’t have the SQL ID associated with them?

Here’s some code from [blog post here: http://oracleprof.blogspot.com/2011/07/how-to-find-sqlid-and-planhashvalue-in.html ] which will take the SQL_Handle and display the SQL ID and Plan Hash Value (you may need an explicit grant to use DBMS_CRYPTO in your PDB for this to work):

For SQL Handle: SQL_23829f7bf3712438 

declare
v_sqlid VARCHAR2(13);
v_num number;
BEGIN
 dbms_output.put_line('SQL_ID '||' '|| 'PLAN_HASH_VALUE' || ' ' || 'SQL_HANDLE ' || ' ' || 'PLAN_NAME');
 dbms_output.put_line('-------------'||' '|| '---------------' || ' ' || '------------------------------' || ' ' || '--------------------------------');
 for a in (select sql_handle, plan_name, trim(substr(g.PLAN_TABLE_OUTPUT,instr(g.PLAN_TABLE_OUTPUT,':')+1)) plan_hash_value, sql_text
from (select t.*, c.sql_handle, c.plan_name, c.sql_text from dba_sql_plan_baselines c, table(dbms_xplan.DISPLAY_SQL_PLAN_BASELINE(c.sql_handle, c.plan_name)) t
where c.sql_handle = '&sql_handle') g
where PLAN_TABLE_OUTPUT like 'Plan hash value%') loop
  v_num := to_number(sys.UTL_RAW.reverse(sys.UTL_RAW.SUBSTR(dbms_crypto.hash(src => UTL_I18N.string_to_raw(a.sql_text || chr(0),'AL32UTF8'), typ => 2),9,4)) || sys.UTL_RAW.reverse(sys.UTL_RAW.SUBSTR(dbms_crypto.hash(src => UTL_I18N.string_to_raw(a.sql_text || chr(0),'AL32UTF8'), typ => 2),13,4)),RPAD('x', 16, 'x'));
  v_sqlid := '';
  FOR i IN 0 .. FLOOR(LN(v_num) / LN(32))
  LOOP
   v_sqlid := SUBSTR('0123456789abcdfghjkmnpqrstuvwxyz',FLOOR(MOD(v_num / POWER(32, i), 32)) + 1,1) || v_sqlid;
  END LOOP;
 dbms_output.put_line(v_sqlid ||' ' || rpad(a.plan_hash_value,15) || ' ' || rpad(a.sql_handle,30) || ' ' || rpad(a.plan_name,30));
end loop;
end;
/

SQL_ID        PLAN_HASH_VALUE SQL_HANDLE                     PLAN_NAME
------------- --------------- ------------------------------ --------------------------------
7f4n59twq8mrj 3036703685      SQL_23829f7bf3712438           SQL_PLAN_270nzggtr291s1a6ce30c 

 

That baseline is definitely not mine. It’s for the wrong SQL ID!
The “Plans Loaded” message was correct when it said “0“!

Why? There was no error message, no output other than the number of plans loaded. That sucks dbms_spm!

I need to trace it. Do that by using dbms_spm.configure

(thank you Timur Akhmadeev for helping me! It’s not easy to google/MOS how to do this, hence this post so I don’t forget again!)

declare
cnt number;
begin
 dbms_spm.configure('spm_tracing',1);
 cnt := dbms_spm.load_plans_from_sqlset(sqlset_name=>'NEIL',SQLSET_OWNER=>'NEIL_DBA');
 dbms_output.put_line('Plans Loaded : '||to_char(cnt));
 dbms_spm.configure('spm_tracing',0);
end;
/

Looking in the trace file, it tells you what went wrong:

*** 2019-01-01 12:00:00.007
load sts: STS=NEIL, owner=NEIL_DBA
load sts: cursor opened
load sts: sql_id=4jcxvz3adqbs2 phv=1905606778
load sts: plan has empty outline, skipping it
load sts: plans total=0 plans loaded=0

So why does my plan have an empty outline?

The SQL_TEXT began “INSERT WHEN CNT = 1 THEN INTO SOME_TABLE… “.

It’s a multi-table insert, inserting into different tables depending upon a condition.
Baselines for Multi-table Inserts are NOT supported by SPM.

https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9014.htm#SQLRF01604

I hope this helps you with your baseline creation troubleshooting!

Having baseline creation problems, you should check MOS article 789520.1

 

Oracle SQL Monitor not monitoring my SQL

I needed to monitor a SQL statement in 11.2.0.3 (the limits mentioned below are the same in 12.1, 12.2, 18.4 and 19C) to determine what is was doing and why it was slow.sql_monitor

Usually I would use SQL Monitor [NOTE: You need to license the Oracle Tuning Pack to use SQL Monitor] for this but the SQL was not appearing in there, despite running for over 5 seconds, and being a parallel SQL (both of which qualify to be included in SQL Monitor). So I asked Twitter why, and thought I’d share the output here.

https://twitter.com/ChandlerDBA/status/1075692070952677376

It was nailed immediately by Jonathan Lewis, with added help from Ivica Arsov. (thank you!)

There is a hidden parameter “_sqlmon_max_planlines” which states that any SQL with a plan in excess of 300 lines should not be monitored (see below for SQLMon hidden parameters – and change them at your own risk, preferably with the backing of an SR from Oracle Support). This execution plan had well over 300 lines. The solution is to change either the session or the system to allow monitoring to happen when the plan is over 300 lines.

e.g.

alter system  set "_sqlmon_max_planlines"=500 scope=memory sid='*';
or
alter session set "_sqlmon_max_planlines"=500;

The negative side effect it that the monitoring will use more resources (primarily memory and CPU), which is why there are default limits on this feature. You might want to change it back when you’re finished to conserve resources.

Note that if you change the system parameter whilst the SQL is running, it will start to monitor the SQL at that point, so you will only get a partial picture of what is taking place, which is less valuable.

select ksppinm, ksppstvl, ksppdesc
  from sys.x$ksppi a, sys.x$ksppsv b
 where a.indx=b.indx
  and lower(ksppinm) like lower('%sqlmon%')
order by ksppinm;

KSPPINM                   KSPPSTVL  KSPPDESC
------------------------- --------- --------------------------------------------------------------------------------
_sqlmon_binds_xml_format  default   format of column binds_xml in [G]V$SQL_MONITOR
_sqlmon_max_plan          640       Maximum number of plans entry that can be monitored. Defaults to 20 per CPU
_sqlmon_max_planlines     300       Number of plan lines beyond which a plan cannot be monitored
_sqlmon_recycle_time      60        Minimum time (in s) to wait before a plan entry can be recycled
_sqlmon_threshold         5         CPU/IO time threshold before a statement is monitored. 0 is disabled

You may also notice a few other parameters in there. The “_sqlmon_recycle_time” hows the amount of time that the SQLMon plan will be guaranteed to be retained. Any retention time after that will be a bonus and depend upon the amount of SQL needing to be monitored. I see monitoring plans disappearing after 2-3 minutes in some systems, so you need to be quick, and you should save the plans down to disk.

save_sqlmon

The mad thing is that I was aware of this restriction before I posted by request for help on Twitter but I’d completely forgotten about it. So here’s the blog post to help me remember!

Stats Collection Time Anomaly

Johnathan Lewis (@JLOracle) recently published a short post about Stats Collection Time, talking about the table dba_optstat_operation (and dba_optstat_operation_tasks ), which reminded me about (what I regard as) an anomaly in the output in the NOTES columns in Oracle 12C.

I won’t repeat why it’s useful to check these tables as Johnathans note and @MDWidlakes’s comment here should give you all you need to know.

The DBA_OPTSTAT_OPERATION.NOTES column contains the parameters passed into the DBMS_STATS command, so you know what was done. It also reports the DEFAULT used by the DBMS_STATS job. Well, it does if you call DBMS_STATS explicitly, but the standard overnight auto job just says “default”. Why doesn’t is expand on that the way the explicit call does? If the default was changed between runs, you may end up with very different results but with no indication why. Am I missing something?

The following 2 rows of data show the output from each run. Note that the DEFAULT for METHOD_OPT in this database has been changed from “FOR ALL COLUMNS SIZE AUTO” to “FOR ALL COLUMNS SIZE REPEAT”** but was not explicitly passed-in for either run.

DBMS_STATS.GATHER_SCHEMA_STATS – decodes the DEFAULTs

OPERATION : gather_schema_stats            
TARGET    : MYSCHEMA     
START_TIME: 15-SEP-16 07.04.47 
END_TIME  : 15-SEP-16 07.09.02 
STATUS    : COMPLETED                                   
JOB_NAME  : 
NOTES     : <params>
            <param name="block_sample" val="FALSE"/>
            <param name="cascade" val="NULL"/>
            <param name="concurrent" val="FALSE"/>
            <param name="degree" val="NULL"/>
            <param name="estimate_percent" val="DBMS_STATS.AUTO_SAMPLE_SIZE"/>
            <param name="force" val="FALSE"/>
            <param name="gather_fixed" val="FALSE"/>
            <param name="gather_temp" val="FALSE"/>
            <param name="granularity" val="AUTO"/>
            <param name="method_opt" val="FOR ALL COLUMNS SIZE REPEAT"/>
            <param name="no_invalidate" val="NULL"/>
            <param name="options" val="GATHER"/>
            <param name="ownname" val="MYSCHEMA"/>
            <param name="reporting_mode" val="FALSE"/>
            <param name="statid" val=""/>
            <param name="statown" val=""/>
            <param name="stattab" val=""/>
            <param name="stattype" val="DATA"/>
            </params>

Autotask Overnight Gather – doesn’t decode the DEFAULTs

OPERATION : gather_database_stats (auto)   
TARGET    : AUTO       
START_TIME: 15-SEP-16 22.01.20 
END_TIME  : 15-SEP-16 22.38.40 
STATUS    : COMPLETED            
JOB_NAME  : ORA$AT_OS_OPT_SY_1212  
NOTES     : <params>
            <param name="block_sample" val="FALSE"/>
            <param name="cascade" val="NULL"/>
            <param name="concurrent" val="FALSE"/>
            <param name="degree" val="DEFAULT_DEGREE_VALUE"/>
            <param name="estimate_percent" val="DEFAULT_ESTIMATE_PERCENT"/>
            <param name="granularity" val="DEFAULT_GRANULARITY"/>
            <param name="method_opt" val="DEFAULT_METHOD_OPT"/>
            <param name="no_invalidate" val="DBMS_STATS.AUTO_INVALIDATE"/>
            <param name="reporting_mode" val="FALSE"/>
            <param name="stattype" val="DATA"/>
            </params>

 

**You should control the histograms that you need to maintain your schema stats as you would like Oracle to see them. The Oracle default approach of lots of histograms can be costly to maintain and store, and any stats which use Adaptive Sampling (all prior to 12C, and Height-Balanced / Hybrid in 12C onwards) carry a risk, especially in OLTP systems.
FOR ALL COLUMNS SIZE REPEAT was useful until Oracle12, when a change in the algorithm makes it dangerous, and it should NOT be used.  You should be using GLOBAL_STATS_PREFS and TABLE_STATS_PREFS to control your stats.

 

Accessing STATUS columns efficiently

A frequently reoccuring design problem with relational databases is the issue locating unprocessed rows in a large table, so we know which rows of data are still yet to be processed.

The problem with a STATUS column is that it generally has low cardinality; there are probably only a handful of distinct values [(C)omplete, (E)rror, (U)nprocessed or something like that]. Most records will be (C)omplete. This makes STATUS a poor candidate for standard B-Tree indexation. In a high throughput OLTP database, using bitmap indexes is probably not an option due to concurrency.

[Aside: When coding flag columns in Oracle, ALWAYS use a VARCHAR2(1 CHAR) {or CHAR(1 CHAR) if you prefer, but a CHAR is a VARCHAR2 under the covers and occupies the same number of bytes}. This is in preferance to a NUMBER(1). which occupies more bytes for a “1” than a “0”, so when you update it, you run the risk of row migration, chained rows and a performance hit. Frequently, ORM’s like Hibernate code for NUMBER by default. Override this!]

So what are my options? There’s a short list of possible table accesses for a low cardinality column.

1. Table scan. In an OLTP database where you only want a tiny fraction of the rows in the table, this would be a bad chouce.
2. Index the accessed columns and accept the inevitable INDEX_SCAN or FAST_FULL_INDEX_SCAN. This is not great and you probably need a Histogram on the column to convince the optimizer to use the index for your low frequency values. Otherwise you may be back to the table scan.
3. Make the “Complete” status “NULL”.
4. Uses a function-based index which makes the Complete status seems to be NULL for a specific query.

So what’s with options 3 and 4, why are they good, and how do we use them?

Unlike some RBDMS’s, Oracle does not store NULL values in it’s simple (non-composite) b-tree indexes. Therefore, if you choose Option (3) and make your “Complete” status be represented by a NULL, you will maintain an index on STATUS in which the only values that are stored are values you are interested in. This makes the index very sexy to the optimizer as it will generally be very tiny. However, we face one small problem. Convincing Developers that having a NULL as a valid status can be difficult. A NULL is a non-representative value. It is not supposed to represent anything. It means “I don’t know”. It doesn’t behave the same an normal values. This tends to freak out Developers and designers sometimes.

That’s where Option 4 comes in. If we wrap the index definition in a CASE statement, to produce a function-based index, we have have a highly specific tailored index on our table. If the SQL predicate matches the query exactly, we get a serious performance payoff.

But don’t take my word for it. Here’s a worked example from my laptop:

 
Here’s the table, it’s data distribution (16m rows, and a handful we care about)

NEIL @ ORCL01 > desc test_table
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 ID                            NOT NULL NUMBER
 STATUS                        NOT NULL VARCHAR2(1 CHAR)
 DESCRIPTION                   NOT NULL VARCHAR2(100 CHAR)

NEIL @ ORCL01 > select status,count(*) from test_table group by status

S   COUNT(*)
- ----------
E         16
C   16777216
Y         32

 
Here are the indexes on the table, and their sizes. As you can see, the function-based index is absolutely tiny, making it as attractive to storage admins as it is to the optimizer.

- alter table test_table add constraint test_table_pk primary key (id);
- create index test_table_CASE on test_table (case status when 'Y' then status else null end);
- create index test_table_COVER_COMP on test_table (status, id) compress 1;
- create index test_table_STATUS on test_table (status) compress 1;



NEIL @ ORCL01 > select segment_name,segment_type,sum(bytes/1024) kb from user_extents 
where segment_name like 'TEST_TABLE%' 
group by segment_type,segment_name order by 2 desc,1;

SEGMENT_NAME               SEGMENT_TYPE               KB
-------------------------- ------------------ ----------
TEST_TABLE                 TABLE                  555008
TEST_TABLE_CASE            INDEX                      64
TEST_TABLE_COVER_COMP      INDEX                  658432
TEST_TABLE_PK              INDEX                  319488
TEST_TABLE_STATUS          INDEX                  413696

Some Index stats:
INDEX_NAME                DISTINCT_KEYS AVG_LEAF_BLOCKS_PER_KEY AVG_DATA_BLOCKS_PER_KEY CLUSTERING_FACTOR STATUS     NUM_ROWS SAMPLE_SIZE LAST_ANAL
------------------------- ------------- ----------------------- ----------------------- ----------------- -------- ---------- ----------- ---------
TEST_TABLE_CASE                       1                       1                       6                 6 VALID            32          32 21-FEB-16
TEST_TABLE_COVER_COMP          16748149                       1                       1            125447 VALID      16748149      234974 21-FEB-16
TEST_TABLE_PK                  17003239                       1                       1             91391 VALID      17003239      492287 21-FEB-16
TEST_TABLE_STATUS                     3                   13828                   32011             96034 VALID      16257590      363295 21-FEB-16

 
Where we have a choice of useful indexes, we get a FAST FULL SCAN with a hefty cost. A histogram could have given us an index RANGE SCAN, which can be very good.
With no Histogram:

select id from test_table where status = 'Y';

Plan hash value: 1140618830

----------------------------------------------------------------------------------------------
| Id  | Operation            | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                       |       |       | 18753 (100)|          |
|*  1 |  INDEX FAST FULL SCAN| TEST_TABLE_COVER_COMP |  5592K|    42M| 18753   (1)| 00:00:01 |
----------------------------------------------------------------------------------------------

 
With a histogram in place on STATUS, you get a much better plan as the covering index avoids the need for the table look-up. You also get the risk that the optimizer may have bind variable peeking issues and other complications should we have lots of table joins.

select id from test_table where status = 'Y'

Plan hash value: 2912582684

------------------------------------------------------------------------------------------
| Id  | Operation        | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                       |       |       |     3 (100)|          |
|*  1 |  INDEX RANGE SCAN| TEST_TABLE_COVER_COMP |    32 |   256 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

NOTE: Ditching the covering index and just using the index on STATUS is pretty efficient too when combined with a histogram:

select id from test_table where status = 'Y'

Plan hash value: 2416598805

---------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                   |       |       |     4 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TEST_TABLE        |    32 |   256 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | TEST_TABLE_STATUS |    32 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------


 
And now with the function-based index; having the case statement removing all statuses we are not interested-in for a tiny tidy index.

NOTE: The Predicate in the query must EXACTLY match the function-based index for it to be used.

select id from test_table where case status when 'Y' then status else null end = 'Y'

Plan hash value: 2073004851

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                 |       |       |     7 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TEST_TABLE      |    32 |   256 |     7   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | TEST_TABLE_CASE |    32 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------

Conclusion: For a highly skewed STATUS column you need a histogram, which is something you should mostly avoid in OLTP systems using BIND variables. Having a highly focussed function-based index allows for a tiny self-maintaining index which is guaranteed to only be used for queries that you want it to be used for.
 

NOTE: The original idea behind using NULLS to minimise index size came from the performance expert, Jonathan Lewis. I have implemented both NULL-as-complete design and case-based indexes at several clients, in varying forms, and always to great success.

Pre-creating Interval Partitions

One of the major problems with interval-based partitioning is the statistics. Partitions appear dynamically as-needed based upon data being inserted or udpated, and any partition magically appearing isn’t going to have any statistics.

This is generally a stability issue as you will, at best, be using dynamic statistics for your optimizations. So how do we deal with it? My preferred method is to pre-create the partitions and copy statistics from a good partition into the pre-created partition. But how do we get the partitions to appear? Here’s 2 options:

  1. Insert data into the row with the partition key for the partition we wish to create, and rollback. This can be tricky, especially with tables containing many NOT NULL columns, and is subject to failure based upon table evolution.
  2. Lock the relevant partition in shared mode using the commandLOCK TABLE .x. PARTITION FOR <partition-key> IN SHARE MODE;

    This will place a shared lock on the non-existant partition, which Oracle will create so it can lock it. A much less messy solution, and not one I had thought of until shown the light by Dan Jankowski.

So does option 2 work? Here’s a quick example:

10:51:55 NEIL @ ORCL01 > CREATE TABLE interval_table (id NUMBER, created_date DATE)
10:51:55   2             PARTITION BY RANGE (created_date) INTERVAL (NUMTOYMINTERVAL(1,'MONTH'))
10:51:55   3           ( PARTITION part_01 values LESS THAN (TO_DATE('01-JAN-2015','DD-MON-YYYY')))
10:51:55   4  /
Table created.

10:51:55 NEIL @ ORCL01 > select table_name, partition_name,high_value from user_tab_partitions order by 1,2;
TABLE_NAME                     PARTITION_NAME                 HIGH_VALUE
------------------------------ ------------------------------ --------------------------------------------------------------------------------
INTERVAL_TABLE                 PART_01                        TO_DATE(' 2015-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA


use a shared lock to generate a new partition

10:51:55 NEIL @ ORCL01 > lock table interval_table partition for (to_date('01-JAN-2016','DD-MON-YYYY')) in share mode;
Table(s) Locked.

10:51:55 NEIL @ ORCL01 > select table_name, partition_name,high_value from user_tab_partitions order by 1,2;
TABLE_NAME                     PARTITION_NAME                 HIGH_VALUE
------------------------------ ------------------------------ --------------------------------------------------------------------------------
INTERVAL_TABLE                 PART_01                        TO_DATE(' 2015-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
INTERVAL_TABLE                 SYS_P647                       TO_DATE(' 2016-02-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

10:51:55 NEIL @ ORCL01 > lock table interval_table partition for (to_date('01-FEB-2016','DD-MON-YYYY')) in share mode;
Table(s) Locked.
10:51:55 NEIL @ ORCL01 > lock table interval_table partition for (to_date('01-MAR-2016','DD-MON-YYYY')) in share mode;
Table(s) Locked.
10:51:55 NEIL @ ORCL01 > lock table interval_table partition for (to_date('01-APR-2016','DD-MON-YYYY')) in share mode;
Table(s) Locked.
10:51:55 NEIL @ ORCL01 > select table_name, partition_name,high_value from user_tab_partitions order by 1,2;
TABLE_NAME                     PARTITION_NAME                 HIGH_VALUE
------------------------------ ------------------------------ --------------------------------------------------------------------------------
INTERVAL_TABLE                 PART_01                        TO_DATE(' 2015-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
INTERVAL_TABLE                 SYS_P647                       TO_DATE(' 2016-02-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
INTERVAL_TABLE                 SYS_P648                       TO_DATE(' 2016-03-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
INTERVAL_TABLE                 SYS_P649                       TO_DATE(' 2016-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
INTERVAL_TABLE                 SYS_P650                       TO_DATE(' 2016-05-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

10:51:55 NEIL @ ORCL01 > -- and release the locks... the rollback doesn't rollback the creation of the new partitions.
10:51:55 NEIL @ ORCL01 > rollback;
Rollback complete.

So now what? To get the stats right I follow the following rule set:

Firstly, lock the table stats when you create the table and have a dedicated, focused stats job. This stop the general stats job from getting in the way of this this important partitioned table.

  • Create a new partition
  • Copy stats using DBMS_STATS.COPY_TABLE_STATS from your fullest partition (with FORCE=TRUE to override the locked stats option). Always try to pretend you have more data than you really have if faking it – it’s safer as an optimized plan with larger expected data sets processing a smaller data set tends to work much better than the stats assuming a small data set and being surprised by lots of data. Consider using SCALE_FACTOR if you have a growing dataset. Don’t reply upon Optimizer magic, such as Adaptive Execution Plans to dig out of a hole.
  • Once a partition becomes “old” (like a no-longer used date-range partition), gather actual stats to get all your stats for that partition accurate. That may even become your new baseline stats copy partition. You will possibly never need to gather stats again for that partition.
  • Be careful if you are copying stats when you have an object-level change. For example, if you put a new index on, you need to re-copy stats post change to any pre-created partitions. Empty pre-created partitions will have accurate (i.e. empty) stats and that’s really not what you want.
  • Repeat as often as you pre-create a new partition

Auditing Read-Only Standbys

If your company has a passing interest in database security, you are probably running some sort of session auditing at the very least [audit session] (although this can also be useful for troubleshooting connectivity issues). There’s a reasonable chance you’re also running some level of object auditing, or even DML access auditing if your employer so dictates:

Check access/change of objects in the DB
  1  select audit_option, success, failure
  2  from dba_stmt_audit_opts
  3  union
  4  select privilege, success, failure
  5  from dba_priv_audit_opts
  6* order by 2,1
AUDIT_OPTION                                       SUCCESS                                  FAILURE
-------------------------------------------------- ---------------------------------------- -----------------
ALTER ANY PROCEDURE                                BY ACCESS                                BY ACCESS
ALTER ANY TABLE                                    BY ACCESS                                BY ACCESS
ALTER DATABASE                                     BY ACCESS                                BY ACCESS
ALTER PROFILE                                      BY ACCESS                                BY ACCESS
ALTER SYSTEM                                       BY ACCESS                                BY ACCESS
ALTER TABLE                                        BY ACCESS                                BY ACCESS
ALTER USER                                         BY ACCESS                                BY ACCESS
AUDIT SYSTEM                                       BY ACCESS                                BY ACCESS
CREATE ANY JOB                                     BY ACCESS                                BY ACCESS
CREATE ANY LIBRARY                                 BY ACCESS                                BY ACCESS
CREATE ANY PROCEDURE                               BY ACCESS                                BY ACCESS
CREATE ANY TABLE                                   BY ACCESS                                BY ACCESS
CREATE EXTERNAL JOB                                BY ACCESS                                BY ACCESS
CREATE LIBRARY                                     BY ACCESS                                BY ACCESS
CREATE PROCEDURE                                   BY ACCESS                                BY ACCESS
CREATE PUBLIC DATABASE LINK                        BY ACCESS                                BY ACCESS
CREATE SESSION                                     BY ACCESS                                BY ACCESS
CREATE TABLE                                       BY ACCESS                                BY ACCESS
CREATE USER                                        BY ACCESS                                BY ACCESS
DATABASE LINK                                      BY ACCESS                                BY ACCESS
DIRECTORY                                          BY ACCESS                                BY ACCESS
DROP ANY PROCEDURE                                 BY ACCESS                                BY ACCESS
DROP ANY TABLE                                     BY ACCESS                                BY ACCESS
DROP PROFILE                                       BY ACCESS                                BY ACCESS
DROP USER                                          BY ACCESS                                BY ACCESS
EXEMPT ACCESS POLICY                               BY ACCESS                                BY ACCESS
GRANT ANY OBJECT PRIVILEGE                         BY ACCESS                                BY ACCESS
GRANT ANY PRIVILEGE                                BY ACCESS                                BY ACCESS
GRANT ANY ROLE                                     BY ACCESS                                BY ACCESS
PROFILE                                            BY ACCESS                                BY ACCESS
PUBLIC DATABASE LINK                               BY ACCESS                                BY ACCESS
PUBLIC SYNONYM                                     BY ACCESS                                BY ACCESS
ROLE                                               BY ACCESS                                BY ACCESS
SYSTEM AUDIT                                       BY ACCESS                                BY ACCESS
SYSTEM GRANT                                       BY ACCESS                                BY ACCESS
USER                                               BY ACCESS                                BY ACCESS
CREATE JOB                                         BY SESSION                               BY SESSION

And you’re probably writing it into a database table [AUDIT_TRAIL=’DB’]

so how does that work if you open a Dataguard database read only? You are writing into sys.aud$ on the Primary, and that table is replicated to the Standby. So what happens?

From Oracle 11G, if you are running Dataguard and opening the standby up for read access, you may not notice the line in the ALERT log which reads:

AUDIT_TRAIL initialization parameter is changed to OS, as DB is NOT compatible for database opened with read-only access

So it starts writing down O/S audit trail files for all of your auditing options (well, the session connect and DML audit options – you can’t run DDL in a r/o DB). You might want to go and see just how many files it has written to [audit_file_dest], as you may be surprised at just how many are in there.

You may, one day, either run out of space or (more worryingly) have so many millions of files that it causes a performance problem when Oracle access the O/S directory. You might want to think about some sort of periodic clean-up job.

%d bloggers like this: