Where’s my Oracle SQL Plan Baseline?

Not so long ago I was having fun creating SQL Plan Baselines in an old 11.2.0.3 database due to be decommissioned but which needs to keep running for a while (no doubt several years) – so minimal time/money to be expended on it. Then, one day, I couldn’t create a baseline and needed to figure out why…

We get the occasional painful plan change, so pinning down an acceptable historic plan using a Baseline is becoming a regular occurrence. The standard route for this is:

1. Notice plan isn’t good and that we have been running with a good plan historically
2. Identify a plan hash value which is acceptable
3. Load it from the library cache if that plan is in there (unlikely)

[ Kerry Osborne has a useful blog post/script for this so I won’t reproduce here: http://kerryosborne.oracle-guy.com/2012/03/displaying-sql-baseline-plans/ ]

4. If not, create a SQL Tuning Set from AWR containing that specific plan and convert the SQL Tuning Set into a Baseline. You will need the begin and end snapshot IDs containing the plan, plus the sql_id and plan_hash_value of the specific plan you want to baseline:

Please ensure you have suitable licensing before running any of the following code!

declare
  baseline_ref_cur DBMS_SQLTUNE.SQLSET_CURSOR;
begin
  -- Create an empty SQL Tuning Set, then populate it from AWR with the one
  -- sql_id / plan_hash_value combination we want to baseline
  DBMS_SQLTUNE.CREATE_SQLSET('NEIL');
  open baseline_ref_cur for
    select VALUE(p) from table(
      DBMS_SQLTUNE.SELECT_WORKLOAD_REPOSITORY(&begin_snap_id, &end_snap_id,
        'sql_id='||CHR(39)||'4jcxvz3adqbs2'||CHR(39)||' and plan_hash_value=1905606778',
        NULL,NULL,NULL,NULL,NULL,NULL,'ALL')) p;
  DBMS_SQLTUNE.LOAD_SQLSET('NEIL', baseline_ref_cur);
end;
/

Did we get it?

select * from dba_sqlset_statements where sqlset_name = 'NEIL';

SQLSET_NAME, SQLSET_OWNER, SQLSET_ID, SQL_ID,       FORCE_MATCHING_SIGNATURE, SQL_TEXT,                                   PARSING_SCHEMA_NAME, PLAN_HASH_VALUE ...
----------------------------------------------------------------------------------------------------------------------------------------------------------
NEIL         NEIL_DBA             35  4jcxvz3adqbs2 6134983393191283611       INSERT WHEN CNT = 1 THEN INTO SOME_TABLE...  APP                  1905606778      ...

Cool! I have captured the SQL. Let’s create the baseline using dbms_spm.load_plans_from_sqlset:

declare
cnt number;
begin
 cnt := dbms_spm.load_plans_from_sqlset(sqlset_name=>'NEIL',SQLSET_OWNER=>'NEIL_DBA');
 dbms_output.put_line('Plans Loaded : '||to_char(cnt));
end;
/
... and I get ...

Plans Loaded : 0

Zero plans? So what plans are in there? Has my plan actually appeared?

select SQL_HANDLE,PLAN_NAME,CREATOR,ORIGIN,PARSING_SCHEMA_NAME, ENABLED,ACCEPTED,REPRODUCED,CREATED
  from dba_sql_plan_baselines where origin like 'MANUAL-LOAD%' order by created desc;

SQL_HANDLE           PLAN_NAME                      CREATOR   ORIGIN      PARSING_SCHEMA_NAME  ENABLED  ACCEPTED  REPRODUCED  CREATED
-----------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_23829f7bf3712438 SQL_PLAN_270nzggtr291s1a6ce30c NEIL_DBA  MANUAL-LOAD APP                  YES      YES       YES         2018-12-31 00.00.00.000000000

No! But how do I prove that? Baselines don’t have the SQL ID stored with them.

Here’s some code from this blog post (http://oracleprof.blogspot.com/2011/07/how-to-find-sqlid-and-planhashvalue-in.html) which will take the SQL handle and display the SQL ID and plan hash value (you may need an explicit grant to use DBMS_CRYPTO in your PDB for this to work):

For SQL Handle: SQL_23829f7bf3712438 

declare
  v_sqlid VARCHAR2(13);
  v_num   number;
BEGIN
  dbms_output.put_line('SQL_ID '||' '|| 'PLAN_HASH_VALUE' || ' ' || 'SQL_HANDLE ' || ' ' || 'PLAN_NAME');
  dbms_output.put_line('-------------'||' '|| '---------------' || ' ' || '------------------------------' || ' ' || '--------------------------------');
  -- For each plan under this SQL handle, pull the "Plan hash value" line out
  -- of dbms_xplan.display_sql_plan_baseline and keep the SQL text
  for a in (select sql_handle, plan_name,
                   trim(substr(g.PLAN_TABLE_OUTPUT,instr(g.PLAN_TABLE_OUTPUT,':')+1)) plan_hash_value,
                   sql_text
              from (select t.*, c.sql_handle, c.plan_name, c.sql_text
                      from dba_sql_plan_baselines c,
                           table(dbms_xplan.DISPLAY_SQL_PLAN_BASELINE(c.sql_handle, c.plan_name)) t
                     where c.sql_handle = '&sql_handle') g
             where PLAN_TABLE_OUTPUT like 'Plan hash value%') loop
    -- The sql_id is derived from bytes 9-16 of the MD5 hash (typ => 2) of the
    -- SQL text plus a trailing chr(0)
    v_num := to_number(sys.UTL_RAW.reverse(sys.UTL_RAW.SUBSTR(dbms_crypto.hash(src => UTL_I18N.string_to_raw(a.sql_text || chr(0),'AL32UTF8'), typ => 2),9,4))
                    || sys.UTL_RAW.reverse(sys.UTL_RAW.SUBSTR(dbms_crypto.hash(src => UTL_I18N.string_to_raw(a.sql_text || chr(0),'AL32UTF8'), typ => 2),13,4)),
                       RPAD('x', 16, 'x'));
    v_sqlid := '';
    -- Convert the number into Oracle's base-32 sql_id character set
    FOR i IN 0 .. FLOOR(LN(v_num) / LN(32))
    LOOP
      v_sqlid := SUBSTR('0123456789abcdfghjkmnpqrstuvwxyz',FLOOR(MOD(v_num / POWER(32, i), 32)) + 1,1) || v_sqlid;
    END LOOP;
    dbms_output.put_line(v_sqlid ||' ' || rpad(a.plan_hash_value,15) || ' ' || rpad(a.sql_handle,30) || ' ' || rpad(a.plan_name,30));
  end loop;
end;
/

SQL_ID        PLAN_HASH_VALUE SQL_HANDLE                     PLAN_NAME
------------- --------------- ------------------------------ --------------------------------
7f4n59twq8mrj 3036703685      SQL_23829f7bf3712438           SQL_PLAN_270nzggtr291s1a6ce30c 

 

That baseline is definitely not mine. It’s for the wrong SQL ID!
The “Plans Loaded” message was correct when it said “0”!

Why? There was no error message, no output other than the number of plans loaded. That sucks, dbms_spm!

I need to trace it. Do that by using dbms_spm.configure.

(thank you Timur Akhmadeev for helping me! It’s not easy to google/MOS how to do this, hence this post so I don’t forget again!)

declare
cnt number;
begin
 dbms_spm.configure('spm_tracing',1);
 cnt := dbms_spm.load_plans_from_sqlset(sqlset_name=>'NEIL',SQLSET_OWNER=>'NEIL_DBA');
 dbms_output.put_line('Plans Loaded : '||to_char(cnt));
 dbms_spm.configure('spm_tracing',0);
end;
/
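If you’re not sure where that trace went: the entries should land in your session’s trace file, which you can locate with v$diag_info from the same session:

select value from v$diag_info where name = 'Default Trace File';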

Looking in the trace file, we can see what went wrong:

*** 2019-01-01 12:00:00.007
load sts: STS=NEIL, owner=NEIL_DBA
load sts: cursor opened
load sts: sql_id=4jcxvz3adqbs2 phv=1905606778
load sts: plan has empty outline, skipping it
load sts: plans total=0 plans loaded=0

So why does my plan have an empty outline?

The SQL_TEXT began “INSERT WHEN CNT = 1 THEN INTO SOME_TABLE… “.

It’s a multi-table insert, inserting into different tables depending upon a condition.
Baselines for Multi-table Inserts are NOT supported by SPM.

https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9014.htm#SQLRF01604
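If you want to see that missing outline for yourself, here’s a quick check (a sketch, not a guaranteed diagnostic): ask dbms_xplan to show any outline data AWR holds for the plan. SPM needs that outline hint set to reproduce a plan, and for a statement like this you would expect the output to come back without an “Outline Data” section.

select * from table(
  dbms_xplan.display_awr('4jcxvz3adqbs2', 1905606778, null, 'BASIC +OUTLINE'));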

I hope this helps you with your baseline creation troubleshooting!

If you are having baseline creation problems, you should also check MOS article 789520.1.

 


SAN Migration: When modern SANs “Fail”

… or at least, when they don’t behave like you were expecting them to.

I recently performed a major migration of a dozen mission-critical Oracle and SQL Server systems from a pair of old EMC CX700s to two brand new, shiny HP XP20000s. This blog post is intended to show, at a very high level, what steps were taken to ensure the new SAN could cope with the volume of traffic. It also shows how, despite the new SAN being much more modern and intelligent, you can still get caught out by unexpected behaviour.

The databases to be migrated ranged from high-throughput OLTP and low-throughput but fast-response-time OLTP to a small Data Warehouse and Grid Control. The systems are 24 x 7 x 365, but a nightly maintenance window can be arranged on some days of the week between midnight and 3am. Not a large window, but invaluable.

What are we migrating?

For a successful SAN migration, we first need to understand what we are migrating. To do this we need to profile the I/O requirement of each Oracle database, and Oracle provides all of the tools we need to do it. If you have the Tuning and Diagnostics Packs licensed (which all Oracle sites should have; they should really be part of the base product) then you can pull the information you need directly out of the AWR view DBA_HIST_SYSTEM_EVENT. By default, AWR snapshots are gathered every hour and only kept for about a week. Hourly averages of system event wait times give a good starting point, but I would not want to use them to determine peak volumes. If you capacity plan to your averages, you will run out of resources before you hit your peaks, and a lot can happen in an hour when you are looking at millisecond response times.

You need to be much more granular than an hourly average, but you might not want to change AWR to gather all of its information as frequently as you need it for the I/O events. So you need to develop some code to gather the information from v$system_event directly, which is what I did [I will post this code up shortly]. You can be as granular as you want, but I balanced peaks against the volume of information and the impact of gathering it, and was sampling every 5 minutes. Mostly. I also took some much more frequent samples to ensure that I wasn’t wildly wrong with the 5-minute averages, and that I wasn’t experiencing very short, peaky loads. This method also has the advantage of not needing any additional pack licensing from Oracle. Sorry Larry.
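My 5-minute capture code isn’t posted yet, but the idea is simple enough to sketch here. This is not the script I actually ran, and the table and job names are made up for illustration: snapshot the cumulative counters from v$system_event into a table on a schedule, then diff consecutive samples to get the interval averages.

-- Sketch only: holding table plus a 5-minute DBMS_SCHEDULER job
create table io_event_samples as
select sysdate sample_time, event, total_waits, time_waited_micro
  from v$system_event
 where 1=0;

begin
  dbms_scheduler.create_job(
    job_name        => 'CAPTURE_IO_EVENTS',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'insert into io_event_samples
                          select sysdate, event, total_waits, time_waited_micro
                            from v$system_event
                           where wait_class in (''User I/O'',''System I/O'');
                        commit;',
    repeat_interval => 'FREQ=MINUTELY;INTERVAL=5',
    enabled         => TRUE);
end;
/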

What are the resources available on the new SAN?

The second requirement for a SAN migration is to understand the capabilities of your new SAN. There were already some systems on the new SAN, so I was unable to test it to destruction. However, some dedicated resources were available on the shared SAN, namely dedicated disks, cache and front-end ports. I needed to throw substantial workloads at the SAN to stress those resources and find the tipping point. There are many workload tools available, such as Hammerora and RAT; some, such as RAT, are expensive. However, for my purposes I was only looking to stress the SAN and get a comfortable feeling about the IOPS and throughput performance. This was achieved using Oracle’s free SAN stress tool, Orion [I will hopefully post more details about this shortly]. By taking all of the profiled I/O rates, adding them up and comparing them against the new SAN and its much greater cache capacity (16GB dedicated to my systems, up from 4GB on the EMCs), we were able to see that the footprint of the databases to be migrated fitted comfortably within the capabilities of the new SAN. And so the migrations began.
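For the “add them up” part, here is a rough sketch of the sort of query involved (again needing the Diagnostics Pack): per-snapshot read/write request and byte rates for one database, pulled from AWR. Run it against each database being migrated and compare the combined peaks with the Orion results.

-- Counters are cumulative since instance startup, so ignore rows spanning a restart
select sn.begin_interval_time,
       st.stat_name,
       round( (st.value - lag(st.value) over
                 (partition by st.stat_name, st.instance_number order by st.snap_id))
            / ((cast(sn.end_interval_time as date)
              - cast(sn.begin_interval_time as date)) * 86400) ) per_second
  from dba_hist_sysstat  st
  join dba_hist_snapshot sn
    on  sn.snap_id = st.snap_id
    and sn.dbid = st.dbid
    and sn.instance_number = st.instance_number
 where st.stat_name in ('physical read total IO requests',
                        'physical write total IO requests',
                        'physical read total bytes',
                        'physical write total bytes')
 order by st.stat_name, sn.begin_interval_time;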

Thin Provisioning

To provide resilience for the databases, LUNs were presented from each SAN to each cluster node, and were mirrored at the Operating System level, either using native mirroring on the Unix platforms, or Veritas on the Windows platforms. This allows either SAN (or SAN location) to fail with no loss of service. The LUNs were either RAID-10 or RAID-5, and were carved out of the storage pools using Thin Provisioning.

Thin Provisioning (or using MetaLUNs or similar techniques from your SAN provider) is a way to allow LUNs to be easily carved out of storage pools and extended at a later date. Because of their dynamic nature and minimal space wastage, thinly provisioned LUNs are wonderful for SAN administrators and for people who have not capacity planned their systems. However, they work on the basis of providing a bit of storage to your LUN from every disk in the array. This gives a maximum-concurrency, maximum-contention model: every LUN is on the same disks, competing for the same resources. Lots of IOPS, but everyone wants them. Thin Provisioning also adds a small overhead to the I/O processing. Not much. Not even a whole millisecond.

Migration Problem

The migrations progressed well, from the least critical to the most critical system. As each system was migrated, we kept monitoring and measuring the I/O response times. Migrating the Data Warehouse showed that the new SAN was performing as expected. Processing times were lower. Not much, but in line with expectation and allocated resources. However, Grid Control was exhibiting some strange response times. Despite theoretically having more resources, some I/O response time metrics were worse on average, and much more variable than before. With the EMC SANs we had experienced a good, consistent level of performance for each LUN. The XPs were proving to be more erratic.

I did a lot of investigation, moving resources around (e.g. dedicated front-end ports for the Grid Control LUNs) and measuring the effect. The level of I/O associated with Grid Control, with only a couple of hundred servers under management, was substantially lower than that of systems competing for shared resources (e.g. the Data Warehouse, which was vastly more demanding). It seemed that Grid Control’s LUNs were not able to survive in the SAN cache. The old EMC CX700s were relatively “dumb” compared to the new XP20000s, and had effectively dedicated an amount of cache to each LUN. The more intelligent XPs would not be so potentially wasteful of resources, and so deemed the (relatively) low level of activity from Grid Control’s LUNs to be unimportant and aged the blocks out.

We could live with this for Grid Control. Blisteringly fast response time is not a critical factor in the success of Grid Control. The same could not be said for one of our mission-critical applications. It requires a very low end-to-end response time for each transaction. We carefully measure each step – client, calculation, render, network latency, database response time, etc. It needs to be fast and consistent. From the database, it really needs to hit cache or SSDs. Spinning rust (ordinary disk) is simply too slow. However, the level of activity for this system, in terms of MB/s and IOPS, is relatively low compared to all of the other systems competing for SAN resources. About 3% of the whole. Add to this the overhead of Thin Provisioning and we have a problem. When a 2ms average turns into a nearly 4ms average with much higher peaks, as it did for db_file_sequential_read on Grid Control, that would be a major problem for this application.

Solution

Talking to HP, they would only guarantee the response time we demand by using SSDs, at a cost that made my eyes spin and think of buying a supercar instead. A more practical (i.e. cheaper) solution had to be found, and it was (with the help and support of HP, just no guarantees 🙂 ). We bought some more conventional disk and configured it into a traditional RAID-10 storage pool. No Thin Provisioning. We then partitioned the 16GB SAN cache into a 4GB cache dedicated to this system, and left the other 12GB behind for everything else. We were migrating from 4GB, and understood that we had plenty of wriggle room in this configuration.

And the result? After migration, the performance of the mission-critical low-throughput system improved substantially, with consistent log_file_parallel_write times of less than 2ms and db_file_sequential_read times of less than 1ms, better than we were experiencing on the old, reliable EMCs. I mean, less than 1ms? That’s fast.

[Graph: I/O Response Times]

The 10046 trace. Largely useless, isn’t it?

The other night I was sat in the pub with some like-minded individuals discussing the relative merits of the 10046 trace (we Rock! in the pub, dudes!) and somebody asked me how often I had actually used it in anger. A well-respected DBA/Architect maintained that it was a pretty useless and difficult option to use, given the topology of modern applications (e.g. how do you find the correct session with all that connection pooling going on from multiple web servers?).

My answer surprised me – I thought back to one client where I spent 90% of my time performance tuning a large (TiB-ish) OLTP/batch hybrid system and concluded that I had run a 10046 against production about once a year. Once. So if the 10046 is the holy grail of plan information, why wasn’t I using it that much? And why did I never use a 10053 against production there?

The answer for me is a little more complex than that given in the pub:

1. As stated above, it’s hard to catch the in-flight session unless the application is instrumented to inject the trace statement when needed (and how many applications are instrumented to help you discover problems? Screen ST03 in SAP is very helpful. Any others in the major ERPs? Thought not.)

2. In many places that I have worked, getting authorisation to make any change to a 24×7 mission-critical system is highly bureaucratic, involving cast-iron justification for the change and its positive benefits, a requirement that there will be no adverse effects because of the change, very senior sign-off, more red tape, etc. This causes a significant amount of work simply to put a trace on, even if you can catch the SQL, and it can end up being more work than actually fixing the problem.

3. An awful lot of SQL tuning is a fairly blunt affair, as the developer (who is frequently database-blind) has usually missed something obvious. It is frequently down to incorrectly using or not using an index (or using a poor one), or to not filtering data at the right point to minimise the I/O.

4. Most importantly, if you have AWR and ASH, it’s not really needed. For each plan created by the optimizer, the database stores the bind variables along with it, so we can usually understand why the optimizer made the decisions it made, and ASH contains the main event waits. Why bother trying to capture all of that detail in a trace when you really don’t need it, and it’s all already there, ready to be extracted from the relevant tables (e.g. dba_hist_active_sess_history, dba_hist_sql_plan and dba_hist_sql_bind)? A sketch of that approach follows below.
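To illustrate point 4, here is a rough sketch (Diagnostics Pack licensing assumed; &sql_id is whatever statement you are chasing) of pulling from AWR and ASH the sort of detail you would otherwise go fishing for in a 10046 trace:

-- Which plans has this statement used, and how did each perform?
select sn.begin_interval_time, st.plan_hash_value,
       round(st.elapsed_time_delta / nullif(st.executions_delta,0) / 1000) avg_elapsed_ms,
       st.disk_reads_delta, st.buffer_gets_delta
  from dba_hist_sqlstat st
  join dba_hist_snapshot sn
    on sn.snap_id = st.snap_id and sn.dbid = st.dbid and sn.instance_number = st.instance_number
 where st.sql_id = '&sql_id'
 order by sn.begin_interval_time;

-- What was it waiting on?
select event, count(*) samples
  from dba_hist_active_sess_history
 where sql_id = '&sql_id'
 group by event
 order by samples desc;

-- Which bind values were captured with it?
select snap_id, name, position, value_string
  from dba_hist_sql_bind
 where sql_id = '&sql_id'
 order by snap_id, position;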

I have never used a 10053 trace on a production system. I have simply never needed to know the decisions taken by the optimizer in that much detail. Like most DBAs and Oracle consultants, I don’t go from site to site on a weekly basis resolving edge-case problems that the incumbent DBAs haven’t had the time, or possibly don’t have the skills, to resolve themselves. I usually don’t need that level of confirmation that I’m right about why the plan is wrong, and I don’t have the time to conclusively prove it over and over again – I just need to get the fix into place and move on to the next problem.

That said, perhaps I should get fully to the bottom of these problems to ensure that they never occur again – which is the fundamental problem with Adaptive Cursor Sharing.