Adding a DEFAULT column in 12C

I was at a talk recently, and there was an update by Jason Arneil about adding columns to tables with DEFAULT values in Oracle 12C. The NOT NULL restriction has been lifted and now Oracle cleverly intercepts the null value and replaces it with the DEFAULT meta-data without storing it in the table. To repeat the 11G experiment I ran recently:


SQL> alter table ncha.tab1 add (filler_default char(1000) default 'EXPAND' not null);
Table altered.

SQL> select table_name,num_rows,blocks,avg_space,avg_row_len 
      from user_tables where table_name = 'TAB1';
TABLE_NAME NUM_ROWS       BLOCKS  AVG_SPACE AVG_ROW_LEN
---------- ---------- ---------- ---------- -----------
TAB1            10000       1504          0        2017


In both releases we then issue:
SQL> alter table ncha.tab1 modify (filler_default null);
Table altered.


IN 11G
SQL> select table_name,num_rows,blocks,avg_space,avg_row_len
      from user_tables where table_name = 'TAB1';

TABLE_NAME NUM_ROWS       BLOCKS  AVG_SPACE AVG_ROW_LEN
---------- ---------- ---------- ---------- -----------
TAB1            10000       3394          0        2017

BUT IN 12C
SQL> select table_name,num_rows,blocks,avg_space,avg_row_len
      from user_tables where table_name = 'TAB1';
TABLE_NAME NUM_ROWS       BLOCKS  AVG_SPACE AVG_ROW_LEN
---------- ---------- ---------- ---------- -----------
TAB1            10000       1504          0        2017

So, as we can see, making the column NULLABLE in 12C didn’t cause Oracle to go through and physically update every row in the way it must in 11G. It’s still a chained-row update accident waiting to happen, but it’s a more flexible accident 🙂

However, I think it’s worth pointing out that you only get this “free” data storage when you add the column. When you insert a record, simply having a column with a DEFAULT value means the DEFAULT gets physically stored with the record if you don’t specify a value. The meta-data trick applies ONLY to columns with DEFAULT values that are added after the rows already exist.
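If you want to see where that DEFAULT actually lives, it’s in the data dictionary rather than in the table blocks. A quick way to check (a sketch, using the NCHA.TAB1 table from this post):

-- The DEFAULT is held against the column definition in the dictionary,
-- not in the rows that existed when the column was added
select column_name, nullable, data_default
  from dba_tab_columns
 where owner = 'NCHA'
   and table_name = 'TAB1'
 order by column_id;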

SQL> create table ncha.tab1 (pk number, c2 timestamp, filler char(1000), filler2 char(1000) DEFAULT 'FILLER2' NOT NULL) pctfree 1;
Table created.

SQL> alter table ncha.tab1 add constraint tab1_pk primary key (pk);
Table altered.

Insert 10,000 rows into the table, but don’t supply a value for FILLER2, which has the DEFAULT
SQL> insert into ncha.tab1 (pk, c2, filler) select rownum id, sysdate, 'A' from dual connect by level <= 10000;
commit;
Commit complete.

Gather some stats and have a look after loading the table. Check for chained rows at the same time.
SQL> exec dbms_stats.gather_table_stats('NCHA','TAB1',null,100);
PL/SQL procedure successfully completed.

SQL> select table_name,num_rows,blocks,avg_space,avg_row_len
     from user_tables where table_name = 'TAB1';

TABLE_NAME   NUM_ROWS     BLOCKS  AVG_SPACE AVG_ROW_LEN
---------- ---------- ---------- ---------- -----------
TAB1            10000       3394          0        2017

For a bit of fun, I thought I would see just how weird the stats might look if I played around with adding defaults.

SQL> drop table ncha.tab1;
Table dropped.

SQL> create table ncha.tab1 (pk number) pctfree 1;
Table created.

SQL> alter table ncha.tab1 add constraint tab1_pk primary key (pk);
Table altered.

Insert 10,000 rows into the table

SQL> insert into ncha.tab1 (pk) select rownum id from dual connect by level <= 10000;
commit;
Commit complete.

Gather some stats and have a look after loading the table. Check for chained rows at the same time.
SQL> exec dbms_stats.gather_table_stats('NCHA','TAB1',null,100);

PL/SQL procedure successfully completed.

SQL> select table_name,num_rows,blocks,avg_space,avg_row_len
  2    from user_tables
  3   where table_name = 'TAB1';

TABLE_NAME   NUM_ROWS     BLOCKS  AVG_SPACE AVG_ROW_LEN
---------- ---------- ---------- ---------- -----------
TAB1            10000         20          0           4

Now let’s add a lot of DEFAULT columns:
SQL> alter table ncha.tab1 add (filler_1 char(2000) default 'F1' not null, filler_2 char(2000) default 'F2' null, filler_3 char(2000) default 'F3', filler_4 char(2000) default 'how big?' null );
Table altered.

Gather some stats and have a look after adding the columns. Check for chained rows at the same time.
SQL> exec dbms_stats.gather_table_stats('NCHA','TAB1',null,100);

PL/SQL procedure successfully completed.

SQL> select table_name,num_rows,blocks,avg_space,avg_row_len
  2    from user_tables
  3   where table_name = 'TAB1';

TABLE_NAME   NUM_ROWS     BLOCKS  AVG_SPACE AVG_ROW_LEN
---------- ---------- ---------- ---------- -----------
TAB1            10000         20          0        8008

10,000 rows with an AVG_ROW_LEN of 8008, all in 20 blocks. Magic!
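A quick sanity check on those numbers (assuming the default 8KB block size, which the post doesn’t state): 10,000 rows at an AVG_ROW_LEN of 8,008 bytes is around 80MB of notional row data, yet 20 blocks is only around 160KB on disk. The only way those figures reconcile is if the DEFAULT values aren’t physically in the blocks at all.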

Just to finish off, let’s look at what’s in each DEFAULT column, and then update them so the table expands….

SQL> select filler_1, filler_2, filler_3, filler_4,count(*) from ncha.tab1 group by filler_1,filler_2,filler_3,filler_4;

FILLER_1   FILLER_2   FILLER_3   FILLER_4     COUNT(*)
---------- ---------- ---------- ---------- ----------
F1         F2         F3         how big?        10000

So it's all there. The metadata is intercepting the nulls and converting them to the default on the fly, rather than storing them in the blocks.
So what happens if we actually UPDATE the table?

SQL> update ncha.tab1 set filler_1 = 'EXPAND', filler_2 = 'EXPAND', filler_3='EXPAND', filler_4='THIS BIG!';
10000 rows updated.

SQL> select filler_1, filler_2, filler_3, filler_4,count(*) from ncha.tab1 group by filler_1,filler_2,filler_3,filler_4;

FILLER_1   FILLER_2   FILLER_3   FILLER_4     COUNT(*)
---------- ---------- ---------- ---------- ----------
EXPAND     EXPAND     EXPAND     THIS BIG!       10000

Gather some stats and have a look after the update, checking for chained rows at the same time.
SQL> exec dbms_stats.gather_table_stats('NCHA','TAB1',null,100);

PL/SQL procedure successfully completed.

SQL> select table_name,num_rows,blocks,avg_space,avg_row_len
     from user_tables where table_name = 'TAB1';

TABLE_NAME   NUM_ROWS     BLOCKS  AVG_SPACE AVG_ROW_LEN
---------- ---------- ---------- ---------- -----------
TAB1            10000      19277          0        8010
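One housekeeping note before the ANALYZE below: LIST CHAINED ROWS needs a CHAINED_ROWS table to write into. If you don’t already have one, the Oracle-supplied script creates it (“?” is SQL*Plus shorthand for $ORACLE_HOME, assuming a standard install):

SQL> @?/rdbms/admin/utlchain.sql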

SQL> 
SQL> analyze table tab1 list chained rows into chained_rows;

Table analyzed.

SQL> select count(*) CHAINED_ROWS from chained_rows;

CHAINED_ROWS
------------
       10000

Yep. That’s bigger.
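And every row is chained, which makes sense: with four CHAR(2000) fillers each row is now around 8,010 bytes, which (again assuming 8KB blocks) simply cannot fit in a single block, so every one of the 10,000 rows has to span at least two blocks – hence the table growing to roughly 19,277 blocks.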

When to use the NOLOCK hint in SQL Server

I frequently hear of, and see, developers and DBAs using the NOLOCK hint within SQL Server to bypass the locking mechanism and return their data sets as soon as possible. There are times when this is OK, such as when you are running an ad hoc query and are only interested in approximate results. It is somewhat less OK to write this hint into application code and reports, unless you don’t actually care whether the data returned is accurate.

The big problem with NOLOCK is that the effects of using it are not fully understood by many of the coders who use it. The common perception is that you’re simply reading uncommitted data, and that the odd rollback isn’t too much to worry about. If that were the full extent of the problem, the developers would have a fair point – we tend not to roll back very often, so don’t worry about it. However, there are more insidious side effects which are not generally understood – effects caused by how the underlying database actually works.

To try and explain the true nature of issuing READ UNCOMMITTED selects, via NOLOCK, I have created an example so you can see the issue at work.

Here are two scripts. SCRIPT 1 creates a table, puts some static data into it, then starts inserting lots more data. Each row is padded for realism, so we only get a few rows per page. The Primary Key is a UNIQUEIDENTIFIER, so we should expect the keys to be scattered, with subsequent inserts landing in the same pages as our initial inserts. This should generate some page splits – something that happens a lot in SQL Server.
SCRIPT1:

IF OBJECT_ID('dbo.test_table') IS NOT NULL
DROP TABLE dbo.test_table;
-- create a table and pad it out so we only get a few rows per block
CREATE TABLE dbo.test_table
(
 pk UNIQUEIDENTIFIER DEFAULT ( NEWID() ) NOT NULL
,search_col VARCHAR(10)
,count_col  INT
,padding CHAR(100) DEFAULT ( 'pad' )
);
alter TABLE dbo.test_table add constraint test_table_pk primary key clustered (pk);
DECLARE @LOOP1 INT
SET @LOOP1=0
WHILE (@LOOP1 < 100)
BEGIN
 SET @LOOP1=@LOOP1+1
 INSERT INTO dbo.test_table ( search_col, count_col ) VALUES('THIS_ONE',1),('THIS_ONE',1),('THIS_ONE',1),('THIS_ONE',1),('THIS_ONE',1),('THIS_ONE',1),('THIS_ONE',1),('THIS_ONE',1),('THIS_ONE',1),('THIS_ONE',1);
END;
select getdate(),sum(count_col) from dbo.test_table (NOLOCK) where search_col = 'THIS_ONE';
set nocount on
-- insert 100,000 rows, which should cause some lovely page splits as the PK will look to insert into the same pages as the data we already have in there
-- we need to run the select in another window at the same time
DECLARE @LOOP INT
SET @LOOP=0
WHILE (@LOOP < 100000)
BEGIN
 SET @LOOP=@LOOP+1
 INSERT INTO dbo.test_table ( search_col, count_col ) VALUES ( CAST( RAND() * 1000000 AS CHAR) , 100000 )
END
select getdate(),sum(count_col) from dbo.test_table (NOLOCK) where search_col = 'THIS_ONE';

Output from SCRIPT1 – note that the two selects, before and after the inserts, give the same result.

----------------------- -----------
2014-10-12 23:51:34.210 1000
---------------------- -----------
2014-10-12 23:51:53.490 1000

Whilst SCRIPT1 is running, run SCRIPT 2 in another window against the same database. It just repeats the same SELECT with (NOLOCK) over and over again. The WHERE clause doesn’t change, and the correct result set should never change… but because of the page splits we see it change. A lot. While a split is in progress, rows are copied into the newly allocated page before they are removed from the old one, so the NOLOCK, performing the READ UNCOMMITTED select, sees that transient duplication.
SCRIPT2:

set nocount on
DECLARE @LOOP INT
SET @LOOP=0
WHILE (@LOOP < 10000)
begin
 SET @LOOP=@LOOP+1
 select getdate(),sum(count_col) from dbo.test_table (NOLOCK) where search_col = 'THIS_ONE';
end;

Output from SCRIPT2 (trimmed)

2014-10-12 23:51:35.473 1000
.
2014-10-12 23:51:35.530 1000
2014-10-12 23:51:35.530 1000
2014-10-12 23:51:35.533 1005
2014-10-12 23:51:35.533 1000
2014-10-12 23:51:35.537 1000
2014-10-12 23:51:35.537 1000
2014-10-12 23:51:35.540 1003
2014-10-12 23:51:35.540 1000
2014-10-12 23:51:35.543 1001
2014-10-12 23:51:35.543 1000
2014-10-12 23:51:35.547 1000
2014-10-12 23:51:35.550 1000
2014-10-12 23:51:35.550 1000
2014-10-12 23:51:35.553 1000
2014-10-12 23:51:35.557 1006
2014-10-12 23:51:35.557 1003
2014-10-12 23:51:35.560 1000
2014-10-12 23:51:35.560 1000
.
2014-10-12 23:51:53.383 1000
2014-10-12 23:51:53.400 1000
2014-10-12 23:51:53.417 1004
2014-10-12 23:51:53.433 1001
2014-10-12 23:51:53.450 1000
2014-10-12 23:51:53.467 1002
2014-10-12 23:51:53.483 1000
2014-10-12 23:51:53.507 1000
Query was cancelled by user.

So, using the NOLOCK hint can return incorrect results, even if the data you are selecting is unchanged, unchanging, and NOT subject to rollback.
Locking is there for a reason. ACID transactions exist for a reason.
If you care about your data, you should try to access it correctly and treat it well; otherwise you have to ask whether the code you are writing really has any value. If it doesn’t, why are you storing the data in an expensive relational database when you could use a freeware database engine, or just pipe it straight to /dev/null – that’s really quick.

One solution to this problem is to change the locking method of SQL Server and start using Read Committed Snapshot Isolation** mode. This allows readers to access the data without blocking writers or being blocked by writers. It works similarly to Oracle’s Multi-Version Concurrency Control and (sweeping generalisation alert!) allows SQL Server to scale better.

**NOLOCK still “works” the same in this mode – it needs to be removed from your code.
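For reference, switching a database to RCSI is a single ALTER DATABASE, although it briefly needs exclusive access to the database to take effect. A minimal sketch, with a placeholder database name:

-- Switch the database to Read Committed Snapshot Isolation
-- (WITH ROLLBACK IMMEDIATE kicks out other sessions so the option can be applied)
ALTER DATABASE [YourDatabase] SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

-- Confirm the setting
SELECT name, is_read_committed_snapshot_on FROM sys.databases WHERE name = 'YourDatabase';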

Releasing to schemas the easy way

Sometimes we just miss the obvious, for years. I’ve just noticed that an easy way to release code to a particular schema is to log in as your normal DBA user (USER1) [as preferred by audit], use the ALTER SESSION command to point your session at the release schema (USER2), and run your DDL. From an object-ownership perspective Oracle behaves as if you were logged in as the schema owner, without the negative aspects of actually having to log in as the schema owner or prefix every object with the schema name.

This is particularly useful when you have multiple schemas within the database and you are releasing to more than one schema at once.


e.g.

sqlplus USER1@ORCL

SQL*Plus: Release 11.2.0.1.0 Production on Wed Sep 17 12:17:36 2014

Copyright (c) 1982, 2010, Oracle. All rights reserved.

Enter password:

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

12:17:38 USER1@ORCL>select owner,table_name from dba_tables where table_name = 'N1';

no rows selected

12:17:46 USER1@ORCL>create table n1 (c1 date);

Table created.

12:17:56 USER1@ORCL>select owner,table_name from dba_tables where table_name = 'N1';

OWNER                          TABLE_NAME
------------------------------ ------------------------------
USER1                          N1

12:17:57 USER1@ORCL>alter session set current_schema=USER2;

Session altered.

12:18:08 USER1@ORCL>create table n1 (c1 date);

Table created.

12:18:11 USER1@ORCL>select owner,table_name from dba_tables where table_name = 'N1';

OWNER                          TABLE_NAME
------------------------------ ------------------------------
USER2                          N1
USER1                          N1
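If you lose track of which schema your session is currently pointing at, sys_context will tell you (a quick check, not part of the session above):

-- Which schema will unqualified object names currently resolve to?
select sys_context('USERENV','CURRENT_SCHEMA') as curr_schema,
       sys_context('USERENV','SESSION_USER')   as sess_user
  from dual;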


Developers Killing Sessions

When you end up spending a far greater percentage of your day than seems sensible killing off Java connections that Developers have carelessly left lying around, locking objects all over the place, you need a solution to make them go away. The solution is to let the Developers do it themselves!

I’m not advocating granting ALTER SYSTEM to Developers! That way madness lies, or certainly some unintended consequences. I’m all for Devs having a lot of freedom in the database, just not freedom with the database.

So, creating a stored procedure to kill sessions, without allowing too much latitude to do anything else, seems appropriate (in this example it is owned by SYS, but any user with the ALTER SYSTEM privilege granted explicitly will do). Here’s one I built earlier:


create or replace procedure sys.kill_session 
 ( p_sid IN number, p_serial IN number, p_instance IN number) as
 -- Neil Chandler. Grant the ability to kill session on a restricted basis. 21.07.2010
 l_username varchar2(30) := null;
 l_priv number := 1;
begin
 -- Who owns the session?
 select username into l_username
 from gv$session
 where sid = p_sid and serial#=p_serial and inst_id = p_instance;
 -- Check for DBA role
 select count(*) into l_priv
 from dba_role_privs
 where grantee = l_username and granted_role = 'DBA';
 -- If the user has the DBA priv, deny the kill request
 if l_priv > 0 or l_username is null
 then
 dbms_output.put_line 
  ('User request to kill session '||p_sid||','||p_serial||',@'||p_instance||
   ' denied. Session is for privileged user '||l_username||'.');
 else
 dbms_output.put_line 
   ('Killing user '||l_username||' - '||p_sid||','||p_serial||',@'||p_instance);
 execute immediate 'alter system disconnect session '''||
                    p_sid||','||p_serial||',@'||p_instance||
                   ''' immediate';
 end if;
end;
/

-- and let the proc be seen and used
create or replace public synonym kill_session for sys.kill_session; 
grant execute on kill_session to (whomever);

Then a nifty bit of sql to generate the kill commands for the Developers. 
Please include your favourite columns from gv$session:

select username,status,blocking_session,
 'exec kill_session ('|| sid||','||serial#||','||inst_id||')' Kill_Command 
 from gv$session
where username is not null
  and type <> 'BACKGROUND'
/

USERNAME  STATUS  BLOCKING_SESSION KILL_COMMAND
--------- ------- ---------------- ------------------------------------ 
SYS       ACTIVE                   exec kill_session (31,65,1)
SYSTEM    INACTIVE                 exec kill_session (58,207,1)
USER_1    INACTIVE                 exec kill_session (59,404,1)
USER_2    INACTIVE                 exec kill_session (72,138,1)
USER_2    INACTIVE                 exec kill_session (46,99,2)
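A developer can then just copy and run the generated command for their own session. A quick sketch (remember SET SERVEROUTPUT ON, or the procedure’s DBMS_OUTPUT messages won’t be displayed):

set serveroutput on
exec kill_session (59,404,1)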


May the odds be forever in your favour. Let the killing commence…

Doing it properly?

When giving a presentation last year about how much a DBA should do to get to the bottom of a problem, Martin Widlake, the audience and I amusingly concluded that we probably shouldn’t be doing things quite right first time.


What do I mean by this? Well, we should be doing things right-enough. We don’t want anything to fail. We want the project you are working on to succeed. BUT, where is the mileage in doing everything right first time? NOBODY got commended by their management for implementing a really efficient IOT in the original schema design over and above a Heap. How many people will ooh and aah, and give you a really big bonus, over that nifty single table hash cluster that makes the system just Zing? Or the really, really clever statistics design? Nobody ever notices when the implementation is just right.


That said, even when you have plenty of influence over the design, you can spend ages getting it right – ensuring all the right bits are partitioned in the right way, putting the right services on the most appropriate nodes, seriously exceeding the required number of transactions per second because they gave you a much bigger set of servers and SAN resources than you asked for – and then the Developers go and use an ORM like Hibernate and mess the whole thing up:

– Can you just change the query to work like this… No.
– Your query appears to be mixing ANSI-standard SQL with Oracle-syntax SQL. Can you please be consistent, as this isn’t a good idea… No.
– You want me to write a trigger to audit when a new record has a different value in a column, because you don’t know how to write a join and you have no idea what a Windowing function is in SQL. Oh, you don’t actually understand SQL at all!

Some days it’s just not worth chewing through the straps.

SQL Plan Management – 12C dumb feature

In Oracle 11G, Oracle introduced SQL Plan Management (SPM). It is excellent (I love it to bits). It allows you to create Baselines against SQL, which lock down the SQL execution plan. No more plan flips. More consistency. Perfect**.

Whenever some Baselined SQL is run, Oracle still parses it and compares the parsed output to the accepted (Evolved) baselines. If the newly parsed plan is better, a new baseline is added to DBA_SQL_PLAN_BASELINES but is NOT accepted. This means that you need to spend time manually evolving the baseline: running DBMS_SPM.EVOLVE_SQL_BASELINE and checking the new plan.

If you want it, and/or Oracle evaluates that it is a better plan for that particular set of bind variables, the plan is accepted and becomes a candidate to be used by future executions of your SQL. Complete control over your execution plans.
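The manual evolve step looks something like this (a sketch rather than output from a real system; the sql_handle is a placeholder you would take from DBA_SQL_PLAN_BASELINES, and with COMMIT => 'NO' it only reports what it would accept):

-- Report on, but do not yet accept, unaccepted plans for one baseline
SELECT DBMS_SPM.EVOLVE_SQL_BASELINE(
         sql_handle => 'SQL_1234567890abcdef',
         verify     => 'YES',
         commit     => 'NO') AS evolve_report
FROM   dual;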

So, Oracle, what’s all this about in Oracle 12C, eh?

In Oracle 12C there’s a new SPM Evolve advisor task. “By default, SYS_AUTO_SPM_EVOLVE_TASK runs daily in the scheduled maintenance window” – So, it runs every night and by default it runs DBMS_SPM.EVOLVE_SQL_BASELINE for all new baselines created today and automatically accepts the new plans.

BY DEFAULT? NO! NO! NO!

That is precisely what I don’t want from baselines – Oracle making its own mind up about plans without any input from me. I’m using baselines to stop Oracle changing its mind, to explicitly limit the number of paths allowed by the Optimizer to ones I know about and am comfortable with. Don’t introduce functionality to do the opposite.

So, immediately following the installation of 12C, I would recommend running the following (you need to be SYS for this):

SELECT PARAMETER_NAME, PARAMETER_VALUE AS "VALUE"
FROM   DBA_ADVISOR_PARAMETERS
WHERE  TASK_NAME = 'SYS_AUTO_SPM_EVOLVE_TASK' AND
         PARAMETER_NAME in ('ACCEPT_PLANS','TIME_LIMIT')
ORDER BY 1;

PARAMETER_NAME            VALUE
------------------------- ----------
ACCEPT_PLANS              TRUE
TIME_LIMIT                3600

Then run:

BEGIN
  DBMS_SPM.SET_EVOLVE_TASK_PARAMETER('SYS_AUTO_SPM_EVOLVE_TASK',
    'ACCEPT_PLANS', 'false');
END;
/
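Re-running the DBA_ADVISOR_PARAMETERS query above should now show ACCEPT_PLANS as FALSE.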

OK, back where we were, with any baselines fixed in place and doing what I want them to do! Not change.


**Perfect? No. But Baselines are good and, as long as your DB structure does not change, they should keep working. If they don’t, raise an SR with Oracle as it’s probably a bug.

UPDATE 2015-11-25: This is still as true in 12.1.0.2.5 as it was in 12.0 Grrrr!!!

Oracle’s Locking Model – Multi Version Concurrency Control

[mvcc – linked slides/diagram illustrating Oracle’s Multi-Version Concurrency Control locking model]

SQL Developer insight

When using SQL Developer, there is a feature called Completion Insight which helps you write SQL and PL/SQL in the SQL Worksheets. It will auto-pop helpful things, like object names when you are typing a SELECT statement, or column names if it is aware of the object which you are accessing, and so on.

The pop-up seems to appear inconsistently, turning up when I don’t want it and not appearing when I do. I finally snapped and decided to work out (well, look up) exactly how this feature works. And it’s like this.

1. Make sure your SQL is correctly formed. If you have multiple bits of SQL in the Worksheet and they are not terminated with a “;” or “/”, then it can get confused about where you are contextually within a statement and not pop up.

2. The Auto-popup has a delay of either 0.3s (SQL) or 0.6s (PL/SQL). This can be changed in Tools -> Preferences -> Code Editor -> Completion Insight. I have now set a 1.0s delay so it’s not quite so eager to appear.

3. You want the pop-up to appear? Press CTRL + SPACE. If it still does not appear, then it cannot determine your SQL Context (see point 1) or there’s nothing to pop.

Hope this helps you. It’s certainly helped me.

Oracle Join Syntax for ANSI people

I was talking with a non-Oracle DBA the other day about Oracle SQL syntax for joins, as I had written some SQL which he didn’t understand, using the old form of Oracle notation to signify outer joins – the (+) operator. It got me thinking: I have been using Oracle for a very long time, since before ANSI join syntax was allowed, so I tend to use the Oracle-specific format for joins as my brain is wired to simply understand the notation. However, I understand the standard ANSI syntax too, as I use and administer other RDBMS systems (mainly SQL Server, but I am also familiar with the *= syntax used by Sybase).

I was wondering: which syntax do you use? If you use the ANSI-standard joins with Oracle, have you come across any drawbacks or problems? I seem to recall some anecdotes, but can’t put my finger on any.

Examples of the Syntax:

--Inner Join ANSI
select def.username,usr.username,usr.account_status
from dba_users_with_defpwd def join dba_users usr on (def.username = usr.username)
order by 2
--Inner Join Oracle
select def.username,usr.username,usr.account_status
from dba_users_with_defpwd def, dba_users usr
where def.username = usr.username
order by 2

--Right outer join ANSI
select def.username,usr.username,usr.account_status
from dba_users_with_defpwd def right outer join dba_users usr on (def.username = usr.username)
order by 2

-- Right outer join oracle
-- put the (+) where you want to say "match regardless" [on the "wrong" side]
select def.username,usr.username,usr.account_status
from dba_users_with_defpwd def, dba_users usr
where def.username(+) = usr.username
order by 2
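
One case worth noting, which isn’t in the examples above: there is no old-Oracle-notation equivalent of a FULL OUTER JOIN. The (+) operator can only appear on one side of a predicate (putting it on both sides raises ORA-01468), so for a full outer join you need the ANSI syntax (or a UNION of two outer joins).

--Full outer join ANSI (no Oracle (+) equivalent)
select def.username,usr.username,usr.account_status
from dba_users_with_defpwd def full outer join dba_users usr on (def.username = usr.username)
order by 2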