Online Index Rebuild Problem in 12C

When building an index in Oracle Enterprise edition using the keyword “ONLINE”, if the index build fails for whatever reason a messy set of extents can get left behind.

In Oracle 12.1 there’s a new job to help with this: “SYS”.”CLEANUP_ONLINE_IND_BUILD” which ticks away, undocumented, in the background cleaning up “rogue” extents once an hour, every hour (start time based upon job creation time)However, if I run a CREATE INDEX … ONLINE command on a table and the index creation happens to coincide with the invocation of the cleanup job I get the following error in my alert log

ADR Home = /u01/app/oracle/diag/rdbms/orcl/orcl1:
*************************************************************************
2017-05-12 05:28:37.191123 -04:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_j000_12235.trc:
ORA-12012: error on auto execute of job "SYS"."CLEANUP_ONLINE_IND_BUILD"
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
ORA-06512: at "SYS.DBMS_PDB", line 76
ORA-06512: at line 4

Oh no! The job has failed to clean up the extents that I was currently using. Phew!

The CREATE INDEX … ONLINE command completed with success a few minutes later.

A few minutes after that, an update statement failed with:

ORA-08102 index key not found for obj# 107102 file 16 block 1171234

Yes, I have a corrupt index! index_fadeMy assumption is that the cleanup job had removed all of the extents which had been allocated before the cleanup job started. The index create statement did not notice as the current extent removal attempt blocked (hence the job error) and future to-be created extents were not affected. However, most of the index had been “cleaned up”. #sadface

Solution: re-create the index and make sure the index creation is complete before your hourly job kicks in. Better still, disable the undocumented hourly job… make sure get the full backing of an Oracle Support SR first!

Next Step: Raise SR with Oracle about this!

 

Stats Collection Time Anomaly

Johnathan Lewis (@JLOracle) recently published a short post about Stats Collection Time, talking about the table dba_optstat_operation (and dba_optstat_operation_tasks ), which reminded me about (what I regard as) an anomaly in the output in the NOTES columns in Oracle 12C.

I won’t repeat why it’s useful to check these tables as Johnathans note and @MDWidlakes’s comment here should give you all you need to know.

The DBA_OPTSTAT_OPERATION.NOTES column contains the parameters passed into the DBMS_STATS command, so you know what was done. It also reports the DEFAULT used by the DBMS_STATS job. Well, it does if you call DBMS_STATS explicitly, but the standard overnight auto job just says “default”. Why doesn’t is expand on that the way the explicit call does? If the default was changed between runs, you may end up with very different results but with no indication why. Am I missing something?

The following 2 rows of data show the output from each run. Note that the DEFAULT for METHOD_OPT in this database has been changed from “FOR ALL COLUMNS SIZE AUTO” to “FOR ALL COLUMNS SIZE REPEAT”** but was not explicitly passed-in for either run.

DBMS_STATS.GATHER_SCHEMA_STATS – decodes the DEFAULTs

OPERATION : gather_schema_stats            
TARGET    : MYSCHEMA     
START_TIME: 15-SEP-16 07.04.47 
END_TIME  : 15-SEP-16 07.09.02 
STATUS    : COMPLETED                                   
JOB_NAME  : 
NOTES     : <params>
            <param name="block_sample" val="FALSE"/>
            <param name="cascade" val="NULL"/>
            <param name="concurrent" val="FALSE"/>
            <param name="degree" val="NULL"/>
            <param name="estimate_percent" val="DBMS_STATS.AUTO_SAMPLE_SIZE"/>
            <param name="force" val="FALSE"/>
            <param name="gather_fixed" val="FALSE"/>
            <param name="gather_temp" val="FALSE"/>
            <param name="granularity" val="AUTO"/>
            <param name="method_opt" val="FOR ALL COLUMNS SIZE REPEAT"/>
            <param name="no_invalidate" val="NULL"/>
            <param name="options" val="GATHER"/>
            <param name="ownname" val="MYSCHEMA"/>
            <param name="reporting_mode" val="FALSE"/>
            <param name="statid" val=""/>
            <param name="statown" val=""/>
            <param name="stattab" val=""/>
            <param name="stattype" val="DATA"/>
            </params>

Autotask Overnight Gather – doesn’t decode the DEFAULTs

OPERATION : gather_database_stats (auto)   
TARGET    : AUTO       
START_TIME: 15-SEP-16 22.01.20 
END_TIME  : 15-SEP-16 22.38.40 
STATUS    : COMPLETED            
JOB_NAME  : ORA$AT_OS_OPT_SY_1212  
NOTES     : <params>
            <param name="block_sample" val="FALSE"/>
            <param name="cascade" val="NULL"/>
            <param name="concurrent" val="FALSE"/>
            <param name="degree" val="DEFAULT_DEGREE_VALUE"/>
            <param name="estimate_percent" val="DEFAULT_ESTIMATE_PERCENT"/>
            <param name="granularity" val="DEFAULT_GRANULARITY"/>
            <param name="method_opt" val="DEFAULT_METHOD_OPT"/>
            <param name="no_invalidate" val="DBMS_STATS.AUTO_INVALIDATE"/>
            <param name="reporting_mode" val="FALSE"/>
            <param name="stattype" val="DATA"/>
            </params>

 

 

 

**as it should be in EVERY ORACLE DATABASE EVER from the start, to allow you to control the histograms that you need and need to maintain on your schema. The Oracle default approach of “everything is skewed, thousands of histograms everywhere please” is particularly painful for OLTP databases using Bind Variable. I’m sure some of Oracles Adaptive Optimization is to work around the bad things that happen under this particular scenario.

 

Accessing STATUS columns efficiently

A frequently reoccuring design problem with relational databases is the issue locating unprocessed rows in a large table, so we know which rows of data are still yet to be processed.

The problem with a STATUS column is that it generally has low cardinality; there are probably only a handful of distinct values [(C)omplete, (E)rror, (U)nprocessed or something like that]. Most records will be (C)omplete. This makes STATUS a poor candidate for standard B-Tree indexation. In a high throughput OLTP database, using bitmap indexes is probably not an option due to concurrency.

[Aside: When coding flag columns in Oracle, ALWAYS use a VARCHAR2(1 CHAR) {or CHAR(1 CHAR) if you prefer, but a CHAR is a VARCHAR2 under the covers and occupies the same number of bytes}. This is in preferance to a NUMBER(1). which occupies more bytes for a “1” than a “0”, so when you update it, you run the risk of row migration, chained rows and a performance hit. Frequently, ORM’s like Hibernate code for NUMBER by default. Override this!]

So what are my options? There’s a short list of possible table accesses for a low cardinality column.

1. Table scan. In an OLTP database where you only want a tiny fraction of the rows in the table, this would be a bad chouce.
2. Index the accessed columns and accept the inevitable INDEX_SCAN or FAST_FULL_INDEX_SCAN. This is not great and you probably need a Histogram on the column to convince the optimizer to use the index for your low frequency values. Otherwise you may be back to the table scan.
3. Make the “Complete” status “NULL”.
4. Uses a function-based index which makes the Complete status seems to be NULL for a specific query.

So what’s with options 3 and 4, why are they good, and how do we use them?

Unlike some RBDMS’s, Oracle does not store NULL values in it’s simple (non-composite) b-tree indexes. Therefore, if you choose Option (3) and make your “Complete” status be represented by a NULL, you will maintain an index on STATUS in which the only values that are stored are values you are interested in. This makes the index very sexy to the optimizer as it will generally be very tiny. However, we face one small problem. Convincing Developers that having a NULL as a valid status can be difficult. A NULL is a non-representative value. It is not supposed to represent anything. It means “I don’t know”. It doesn’t behave the same an normal values. This tends to freak out Developers and designers sometimes.

That’s where Option 4 comes in. If we wrap the index definition in a CASE statement, to produce a function-based index, we have have a highly specific tailored index on our table. If the SQL predicate matches the query exactly, we get a serious performance payoff.

But don’t take my word for it. Here’s a worked example from my laptop:

 
Here’s the table, it’s data distribution (16m rows, and a handful we care about)

NEIL @ ORCL01 > desc test_table
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 ID                            NOT NULL NUMBER
 STATUS                        NOT NULL VARCHAR2(1 CHAR)
 DESCRIPTION                   NOT NULL VARCHAR2(100 CHAR)

NEIL @ ORCL01 > select status,count(*) from test_table group by status

S   COUNT(*)
- ----------
E         16
C   16777216
Y         32

 
Here are the indexes on the table, and their sizes. As you can see, the function-based index is absolutely tiny, making it as attractive to storage admins as it is to the optimizer.

- alter table test_table add constraint test_table_pk primary key (id);
- create index test_table_CASE on test_table (case status when 'Y' then status else null end);
- create index test_table_COVER_COMP on test_table (status, id) compress 1;
- create index test_table_STATUS on test_table (status) compress 1;



NEIL @ ORCL01 > select segment_name,segment_type,sum(bytes/1024) kb from user_extents 
where segment_name like 'TEST_TABLE%' 
group by segment_type,segment_name order by 2 desc,1;

SEGMENT_NAME               SEGMENT_TYPE               KB
-------------------------- ------------------ ----------
TEST_TABLE                 TABLE                  555008
TEST_TABLE_CASE            INDEX                      64
TEST_TABLE_COVER_COMP      INDEX                  658432
TEST_TABLE_PK              INDEX                  319488
TEST_TABLE_STATUS          INDEX                  413696

Some Index stats:
INDEX_NAME                DISTINCT_KEYS AVG_LEAF_BLOCKS_PER_KEY AVG_DATA_BLOCKS_PER_KEY CLUSTERING_FACTOR STATUS     NUM_ROWS SAMPLE_SIZE LAST_ANAL
------------------------- ------------- ----------------------- ----------------------- ----------------- -------- ---------- ----------- ---------
TEST_TABLE_CASE                       1                       1                       6                 6 VALID            32          32 21-FEB-16
TEST_TABLE_COVER_COMP          16748149                       1                       1            125447 VALID      16748149      234974 21-FEB-16
TEST_TABLE_PK                  17003239                       1                       1             91391 VALID      17003239      492287 21-FEB-16
TEST_TABLE_STATUS                     3                   13828                   32011             96034 VALID      16257590      363295 21-FEB-16

 
Where we have a choice of useful indexes, we get a FAST FULL SCAN with a hefty cost. A histogram could have given us an index RANGE SCAN, which can be very good.
With no Histogram:

select id from test_table where status = 'Y';

Plan hash value: 1140618830

----------------------------------------------------------------------------------------------
| Id  | Operation            | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                       |       |       | 18753 (100)|          |
|*  1 |  INDEX FAST FULL SCAN| TEST_TABLE_COVER_COMP |  5592K|    42M| 18753   (1)| 00:00:01 |
----------------------------------------------------------------------------------------------

 
With a histogram in place on STATUS, you get a much better plan as the covering index avoids the need for the table look-up. You also get the risk that the optimizer may have bind variable peeking issues and other complications should we have lots of table joins.

select id from test_table where status = 'Y'

Plan hash value: 2912582684

------------------------------------------------------------------------------------------
| Id  | Operation        | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                       |       |       |     3 (100)|          |
|*  1 |  INDEX RANGE SCAN| TEST_TABLE_COVER_COMP |    32 |   256 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

NOTE: Ditching the covering index and just using the index on STATUS is pretty efficient too when combined with a histogram:

select id from test_table where status = 'Y'

Plan hash value: 2416598805

---------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                   |       |       |     4 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TEST_TABLE        |    32 |   256 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | TEST_TABLE_STATUS |    32 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------


 
And now with the function-based index; having the case statement removing all statuses we are not interested-in for a tiny tidy index.

NOTE: The Predicate in the query must EXACTLY match the function-based index for it to be used.

select id from test_table where case status when 'Y' then status else null end = 'Y'

Plan hash value: 2073004851

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                 |       |       |     7 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TEST_TABLE      |    32 |   256 |     7   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | TEST_TABLE_CASE |    32 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------

Conclusion: For a highly skewed STATUS column you need a histogram, which is something you should mostly avoid in OLTP systems using BIND variables. Having a highly focussed function-based index allows for a tiny self-maintaining index which is guaranteed to only be used for queries that you want it to be used for.
 

NOTE: The original idea behind using NULLS to minimise index size came from the performance expert, Jonathan Lewis. I have implemented both NULL-as-complete design and case-based indexes at several clients, in varying forms, and always to great success.

Primarys Keys and their supporting indexes

Sometimes things just happen which makes you realise that stuff you thought all experienced DBA’s knew about isn’t common knowledge. I suppose it’s a side effect of working with the same evolving product for over 25 years at dozens of clients, when the colleague you are talking to has a mere decade of experience at fewer companies (and therefore less diversity of challenge).

Today I noticed that a release was creating a table, then an index, and then adding a Primary Key based upon the index. Pretty normal, but the indexes were non-unique. Interesting. Is that valid? (yes) It is sensible? (under some circumstances, also yes). Is it wrong but nobody noticed until it got to me? Probably.

However, there’s more variables at play here than first meets the eye. The Primary Key is being enforced by a pre-existing non-unique index, rather than create its own index “dynamically” (which would be unique under most circumstances). So therefore we have a few questions to answer

  1. Is the index being used optimal?
  2. How do I know it’s using a pre-created index and the constraint did not create its own index?
  3. What happens if I drop the PK constraint?

 

Scenario 1: create table, create constraint

You get a nice neat fast index enforcing the constraint. If you drop the constraint, the index will also get removed. This may or may not be what you want. You may need to put a new index back on the table…

Scenario 2: create table, create unique index, create constraint

You get a nice neat fast index enforcing the constraint. If you drop the constraint, the index does not get removed. This may or may not be what you want. The index which was left behind might cause a problem if you miss the fact it’s still around…

Scenario 3: create table, create non-unique index (same as constraint columns), create constraint

The index enforcing the constraint is slightly less efficient enforcing the constraint as it has to optimize to an index range scan, not a unique scan. However, if you drop the constraint you still have a non-unique index. This may be a very useful ability for bulk data loads with questionable data, although the use-cases are limited and tend to be Warehouse-centric.

If you have a deferrable constraint, you must have a non-unique index to enforce the constraint as there may temporarily be duplicate keys in the table part-way through the transaction.

Scenario 4: create table, create non-unique composite index, create constraint which uses leading columns of that index but not all of the columns

The index may be somewhat less efficient enforcing the constraint by having less rows per block and possibly more levels, but this inefficiency may help overall system efficiency by having to maintain less indexes – maintaining 2 indexes instead of 1 is a 100% overhead in index maintenance. The composite index cannot be a unique index.

 

Some worked example showing the above scenarios (ran on Oracle 12.1.0.2.0 in a PDB) :

NEIL @ ORCL01 > -- Scenario 1
NEIL @ ORCL01 > drop table neil;
Table dropped.
NEIL @ ORCL01 > create table neil (col_id number, col_txt varchar2(10 char));
Table created.
NEIL @ ORCL01 > alter table neil add constraint neil_pk primary key (col_id) ;
Table altered.
NEIL @ ORCL01 > select index_name,uniqueness from user_indexes where table_name = 'NEIL';
INDEX_NAME           UNIQUENESS
-------------------- --------------------
NEIL_PK              UNIQUE
NEIL @ ORCL01 > insert into neil values (1,'A');
1 row created.
NEIL @ ORCL01 > insert into neil values (1,'B');
insert into neil values (1,'B')
*
ERROR at line 1:
ORA-00001: unique constraint (NEIL.NEIL_PK) violated
NEIL @ ORCL01 > commit;
Commit complete.
NEIL @ ORCL01 > alter table neil drop constraint neil_pk;
Table altered.
NEIL @ ORCL01 > -- And the index is gone...
NEIL @ ORCL01 > select index_name,uniqueness from user_indexes where table_name = 'NEIL';
no rows selected
NEIL @ ORCL01 > -- Scenario 2
NEIL @ ORCL01 > drop table neil;
Table dropped.
NEIL @ ORCL01 > create table neil (col_id number, col_txt varchar2(10 char));
Table created.
NEIL @ ORCL01 > create unique index neil_i on neil(col_id);
Index created.
NEIL @ ORCL01 > alter table neil add constraint neil_pk primary key (col_id) ;
Table altered.
NEIL @ ORCL01 > select index_name,uniqueness from user_indexes where table_name = 'NEIL';
INDEX_NAME           UNIQUENESS
-------------------- --------------------
NEIL_I               UNIQUE
NEIL @ ORCL01 > insert into neil values (1,'A');
1 row created.
NEIL @ ORCL01 > insert into neil values (1,'B');
insert into neil values (1,'B')
*
ERROR at line 1:
ORA-00001: unique constraint (NEIL.NEIL_PK) violated
NEIL @ ORCL01 > commit;
Commit complete.
NEIL @ ORCL01 > alter table neil drop constraint neil_pk;
Table altered.
NEIL @ ORCL01 > -- And the index survives
NEIL @ ORCL01 > select index_name,uniqueness from user_indexes where table_name = 'NEIL';
INDEX_NAME           UNIQUENESS
-------------------- --------------------
NEIL_I               UNIQUE
NEIL @ ORCL01 > -- Scenario 3
NEIL @ ORCL01 > drop table neil;
Table dropped.
NEIL @ ORCL01 > create table neil (col_id number, col_txt varchar2(10 char));
Table created.
NEIL @ ORCL01 > create index neil_i on neil(col_id);
Index created.
NEIL @ ORCL01 > alter table neil add constraint neil_pk primary key (col_id) using index neil_i;
Table altered.
NEIL @ ORCL01 > select index_name,uniqueness from user_indexes where table_name = 'NEIL';
INDEX_NAME           UNIQUENESS
-------------------- --------------------
NEIL_I               NONUNIQUE
NEIL @ ORCL01 > insert into neil values (1,'A');
1 row created.
NEIL @ ORCL01 > insert into neil values (1,'B');
insert into neil values (1,'B')
*
ERROR at line 1:
ORA-00001: unique constraint (NEIL.NEIL_PK) violated
NEIL @ ORCL01 > commit;
Commit complete.
NEIL @ ORCL01 > alter table neil drop constraint neil_pk;
Table altered.
NEIL @ ORCL01 > -- And the index survives
NEIL @ ORCL01 > select index_name,uniqueness from user_indexes where table_name = 'NEIL';
INDEX_NAME           UNIQUENESS
-------------------- --------------------
NEIL_I               NONUNIQUE
NEIL @ ORCL01 > -- Scenario 4
NEIL @ ORCL01 > drop table neil;
Table dropped.
NEIL @ ORCL01 > create table neil (col_id number, col_txt varchar2(10 char));
Table created.
NEIL @ ORCL01 > create index neil_i_composite on neil(col_id,col_txt);
Index created.
NEIL @ ORCL01 > alter table neil add constraint neil_pk primary key (col_id) using index neil_i_composite;
Table altered.
NEIL @ ORCL01 > select index_name,uniqueness from user_indexes where table_name = 'NEIL';
INDEX_NAME           UNIQUENESS
-------------------- --------------------
NEIL_I_COMPOSITE     NONUNIQUE
NEIL @ ORCL01 > insert into neil values (1,'A');
1 row created.
NEIL @ ORCL01 > insert into neil values (1,'B');
insert into neil values (1,'B')
*
ERROR at line 1:
ORA-00001: unique constraint (NEIL.NEIL_PK) violated
NEIL @ ORCL01 > commit;
Commit complete.
NEIL @ ORCL01 > alter table neil drop constraint neil_pk;
Table altered.
NEIL @ ORCL01 > -- And the index survives
NEIL @ ORCL01 > select index_name,uniqueness from user_indexes where table_name = 'NEIL';
INDEX_NAME           UNIQUENESS
-------------------- --------------------
NEIL_I_COMPOSITE     NONUNIQUE

So how do I tell if, when I drop my constraint, I will also drop the index? This does not appear to be exposed in any DBA_ view of which I am aware, so I have had to rely upon this post from Jonathan Lewis to help me decipher the information. You need to see if certain bits are set in the column sys.ind$.property. Here’s my worked example to show the information we seek:

NEIL @ ORCL01 > drop table neil_cons_create;
Table dropped.
NEIL @ ORCL01 > create table neil_cons_create (col_id number, col_txt varchar2(10 char));
Table created.
NEIL @ ORCL01 > alter table neil_cons_create add constraint neil_cons_create_pk primary key (col_id);
Table altered.
NEIL @ ORCL01 > drop table neil_pre_create;
Table dropped.
NEIL @ ORCL01 > create table neil_pre_create (col_id number, col_txt varchar2(10 char));
Table created.
NEIL @ ORCL01 > create unique index neil_pre_create_i on neil_pre_create(col_id);
Index created.
NEIL @ ORCL01 > alter table neil_pre_create add constraint neil_pre_create_pk primary key (col_id);
Table altered.
NEIL @ ORCL01 > select index_name,uniqueness from user_indexes where table_name like 'NEIL%';
INDEX_NAME           UNIQUENESS
-------------------- --------------------
NEIL_CONS_CREATE_PK  UNIQUE
NEIL_PRE_CREATE_I    UNIQUE

select do.owner,do.object_name,do.object_id,ind.property, 'Keep Index' action
  from sys.ind$ ind
       ,dba_objects do
where do.object_name like 'NEIL%'
  and do.object_id = ind.obj#
  and bitand(ind.property,4097) = 4097
union all
select do.owner,do.object_name,do.object_id,ind.property, 'Drop Index' action
  from sys.ind$ ind
      ,dba_objects do
where do.object_name like 'NEIL%'
  and do.object_id = ind.obj#
  and bitand(ind.property,4097) <> 4097
order by 1,2,3
/
OWNER        OBJECT_NAME         OBJECT_ID  PROPERTY   ACTION
------------ ------------------- ---------- ---------- --------------------
NEIL         NEIL_CONS_CREATE_PK      93814       4097 Keep Index
NEIL         NEIL_PRE_CREATE_I        93816          1 Drop Index

Sequences – a quick guide and an unexpected quirk

Sequences on Oracle databases are simple, but there are some basic truths about sequences which need to be understood if you are going to use them, especially on RAC:

  1. Sequences will have “holes” in them, even if you specify NOCACHE (e.g. if you rollback your transaction).
  2. Don’t specify NOCACHE. It doesn’t scale.
  3. Sequences will NOT necessarily be, erm, sequential for your transactions. That will depend upon the COMMIT order and if you are using RAC.
  4. The higher the CACHE value, the bigger the holes but the more scalable the sequence will be become. You will lose the entire set of unused cache values across instance restart.
  5. Oracle stores the CACHE value on EACH RAC node, so for a cache of 100,000, NODE 1 gets 1-100,000, NODE 2 get 100,001-200,000 to play with, and so on.
  6. If you are using RAC, and you have a heavy workload on one node and a low number of transactions on another node, there may be a significant disparity between the NEXTVAL on each node. I have seen this disparity manifest in the tens of millions.
  7. There is a global cache lock which must be passed from node to node when getting a new cache of sequence numbers on that node. It is lightweight, but it is a serialisation point. If your CACHE value is too low for your throughput, you will get contention on this lock.
  8. Same as the serialisation point if you specify ORDERED, so you get the sequences advancing in step (rather than each node having its own cache) on multiple RAC nodes. Don’t do this either.
  9. The point if sequences is to give you a unique value for a (surrogate) Primary Key. You can rely upon sequences to be unique. Well. Mostly. As long as they do not CYCLE.
  10. If you use sequences for a PK, you will probably end up with a right-hand index with hot-block contention during inserts, causing you to implement something suboptimal, like hash partitioning or reverse-key indexes to get around your design flaw.

Sequences on RAC can behave in unexpected ways. The below example is a little contrived but shows a potentially unexpected behaviour caused by CYCLEing your sequences:

NODE 1 NODE 2
create sequence seq_1
start     with 1
increment by   1
maxvalue       10
cache          2
cycle;

Sequence created.

NODE1 :select seq_1.nextval from dual;
NEXTVAL: 1
NODE2 :select seq_1.nextval from dual;
NEXTVAL
3 <- Gets next value above cache on node 1
NODE1 : /
NEXTVAL: 2
NODE1 : /
NEXTVAL: 5 (skipping 3 and 4 on Node 2)
6
7
8
9
10
1
2
3  <- erm. Hang on. Node 2 has this!
4  <- and this!!!
5
6
7
8
9
10
1
NODE2 : /
NEXTVAL
4 <- Reading from the local cache
NODE2 : /
NEXTVAL
3 <- And get a new cache value

So, if you are using sequences and you can’t guarantee sequence and you can’t guarantee no gaps and, as this demo shows, you can’t guarantee uniqueness, and they cause hot block right hand index problems, tell me again why you are using them?

Because they are fast and you don’t CYCLE them.

However, sequences are becoming the new evil in the database, right behind triggers.

Adding a DEFAULT column in 12C

I was at a talk recently, and there was an update by Jason Arneil about adding columns to tables with DEFAULT values in Oracle 12C. The NOT NULL restriction has been lifted and now Oracle cleverly intercepts the null value and replaces it with the DEFAULT meta-data without storing it in the table. To repeat the 11G experiment I ran recently:

 

SQL> alter table ncha.tab1 add (filler_default char(1000) default 'EXPAND' not null);
Table altered.

SQL> select table_name,num_rows,blocks,avg_space,avg_row_len 
      from user_tables where table_name = 'TAB1';
TABLE_NAME NUM_ROWS       BLOCKS  AVG_SPACE AVG_ROW_LEN
---------- ---------- ---------- ---------- -----------
TAB1            10000       1504          0        2017


In both releases we then issue:
SQL> alter table ncha.tab1 modify (filler_default null);
Table altered.


IN 11G
SQL> select table_name,num_rows,blocks,avg_space,avg_row_len
      from user_tables where table_name = 'TAB1';

TABLE_NAME NUM_ROWS       BLOCKS  AVG_SPACE AVG_ROW_LEN
---------- ---------- ---------- ---------- -----------
TAB1            10000       3394          0        2017

BUT IN 12C
SQL> select table_name,num_rows,blocks,avg_space,avg_row_len
      from user_tables where table_name = 'TAB1';
TABLE_NAME NUM_ROWS       BLOCKS  AVG_SPACE AVG_ROW_LEN
---------- ---------- ---------- ---------- -----------
TAB1            10000       1504          0        2017

So, as we can see, making the column NULLABLE in 12C didn’t cause it to go through and update every row in the way it must in 11G. It’s still a chained-row update accident waiting to happen, but its a more flexible accident 🙂

However, I think it’s worth pointing out that you only get “free data storage” when you add the column. When inserting a record, simply having a column with a DEFAULT value means that the DEFAULT gets physically stored with the record if it is not specified. The meta-data effect is ONLY for subsequently added columns with DEFAULT values.

SQL> create table ncha.tab1 (pk number, c2 timestamp, filler char(1000), filler2 char(1000) DEFAULT 'FILLER2' NOT NULL) pctfree 1;
Table created.

SQL> alter table ncha.tab1 add constraint tab1_pk primary key (pk);
Table altered.

Insert 10,000 rows into the table, but not into FILLER2 with the DEFAULT
SQL> insert into ncha.tab1 (pk, c2, filler) select rownum id, sysdate, 'A' from dual connect by level <= 10000;
commit;
Commit complete.

Gather some stats and have a look after loading the table. Check for chained rows at the same time.
SQL> exec dbms_stats.gather_table_stats('NCHA','TAB1',null,100);
PL/SQL procedure successfully completed.

SQL> select table_name,num_rows,blocks,avg_space,avg_row_len
     from user_tables where table_name = 'TAB1';

TABLE_NAME   NUM_ROWS	  BLOCKS  AVG_SPACE AVG_ROW_LEN
---------- ---------- ---------- ---------- -----------
TAB1		10000	    3394	  0	   2017

For a bit of fun, I thought I would see just how weird the stats might look if I played around with adding defaults

SQL> drop table ncha.tab1;
Table dropped.

SQL> create table ncha.tab1 (pk number) pctfree 1;
Table created.

SQL> alter table ncha.tab1 add constraint tab1_pk primary key (pk);
Table altered.

Insert 10,000 rows into the table

SQL> insert into ncha.tab1 (pk) select rownum id from dual connect by level <= 10000;
commit;
Commit complete.

Gather some stats and have a look after loading the table. Check for chained rows at the same time.
SQL> exec dbms_stats.gather_table_stats('NCHA','TAB1',null,100);

PL/SQL procedure successfully completed.

SQL> select table_name,num_rows,blocks,avg_space,avg_row_len
  2    from user_tables
  3   where table_name = 'TAB1';

TABLE_NAME   NUM_ROWS	  BLOCKS  AVG_SPACE AVG_ROW_LEN
---------- ---------- ---------- ---------- -----------
TAB1		10000	      20	  0	      4

Now lets add a lot of defaults
SQL> alter table ncha.tab1 add (filler_1 char(2000) default 'F1' not null, filler_2 char(2000) default 'F2' null, filler_3 char(2000) default 'F3', filler_4 char(2000) default 'how big?' null );
Table altered.

Gather some stats and have a look after adding the column. Check for chained rows at the same time.
SQL> exec dbms_stats.gather_table_stats('NCHA','TAB1',null,100);

PL/SQL procedure successfully completed.

SQL> select table_name,num_rows,blocks,avg_space,avg_row_len
  2    from user_tables
  3   where table_name = 'TAB1';

TABLE_NAME   NUM_ROWS	  BLOCKS  AVG_SPACE AVG_ROW_LEN
---------- ---------- ---------- ---------- -----------
TAB1		10000	      20	  0	   8008

10,000 rows with an AVG_ROW_LEN of 8008, all in 20 blocks. Magic!

Just to finish off, lets update each DEFAULT column so the table expands….

SQL> select filler_1, filler_2, filler_3, filler_4,count(*) from ncha.tab1 group by filler_1,filler_2,filler_3,filler_4;

FILLER_1   FILLER_2   FILLER_3	 FILLER_4     COUNT(*)
---------- ---------- ---------- ---------- ----------
F1	   F2	      F3	 how big?	 10000

So it's all there. The metadata is intercepting the nulls and converting them to the default on the fly, rather than storing them in the blocks.
So what happens if we actually UPDATE the table?

SQL> update ncha.tab1 set filler_1 = 'EXPAND', filler_2 = 'EXPAND', filler_3='EXPAND', filler_4='THIS BIG!';
10000 rows updated.

SQL> select filler_1, filler_2, filler_3, filler_4,count(*) from ncha.tab1 group by filler_1,filler_2,filler_3,filler_4;

FILLER_1   FILLER_2   FILLER_3	 FILLER_4     COUNT(*)
---------- ---------- ---------- ---------- ----------
EXPAND	   EXPAND     EXPAND	 THIS BIG!	 10000

Gather some stats and have a look after the update, checking for chained rows at the same time.
SQL> exec dbms_stats.gather_table_stats('NCHA','TAB1',null,100);

PL/SQL procedure successfully completed.

SQL> select table_name,num_rows,blocks,avg_space,avg_row_len
     from user_tables where table_name = 'TAB1';

TABLE_NAME   NUM_ROWS	  BLOCKS  AVG_SPACE AVG_ROW_LEN
---------- ---------- ---------- ---------- -----------
TAB1		10000	   19277	  0	   8010

SQL> 
SQL> analyze table tab1 list chained rows into chained_rows;

Table analyzed.

SQL> select count(*) CHAINED_ROWS from chained_rows;

CHAINED_ROWS
------------
       10000

Yep. That’s bigger.

%d bloggers like this: