UKOUG 2-Stream London SIG

ukoug_logoWe now have the agenda online for the UKOUG 2-stream London SIG, due to take place on 23rd February. There will be an RDBMS Stream and a RAC, Cloud, Infrastructure and Availability Stream.

Mark Rittman will be giving us a keynote talk, with an emphasis on kettles or eating dinner in the dark or the Hadoop cluster in his garage or, well, probably something IoT-related

There are some really cool other speakers too, from Oracle, Leading Consultancies and the odd Independent too (like me).

The full Agenda and Registration Info can be seen here:
https://www.ukoug.org/events/ukoug-database-and-rac-cia-joint-sig-23022017/

It would be lovely to see you there, and maybe for a beer afterwards too.

Call for Papers – UKOUG 2017 SIGs

All,

The time has come for me to plan the upcoming 2017 UKOUG Special Interest Groups.

I am chairman of the RAC, Cloud, Infrastructure and Availability (RAC-CIA) SIG and I’m after presentations for the 3 joint SIGs we will be putting on with the RDBMS SIG, plus the stand-alone SIG will we will having in the autumn.

A SIG is a single day one-or-two stream conference which we take around the UK to make it as accessible for UKOUG members and non-members alike (NOTE: there is a charge for non-members)

This year we are looking at running:

Thursday 23rd February – RDBMS + RAC, London 2 streams
Thursday 27th April – TBC: Northern Day – RDBMS + RAC + ? , Manchester (probably)
Thursday 15th June – 12.2 Themed event – TVP Reading – 2 stream event RDBMS + RAC
Wednesday 27th September – RAC-CIA SIG, London 1 stream

If you are interested in speaking, you can submit an abstract here:

http://ukoug.formstack.com/forms/submit_an_abstract

or you can email me directly : neil@chandler.uk.com

Be quick! The next event is in 10 weeks time…

Oracle Audit Control part 3 – OS files

Just realized that I wrote a follow-up to Part 2 two years ago and never posted it! So here’s the stripped-out key facts to round off auditing control a bit more:

Deleting O/S audit files is, like, totally necessary. In Oracle 12C (12.1), they are created by default as AUDIT_SYS_OPERATIONS defaults to TRUE now, so you are going to be filling up your AUDIT_FILE_DEST whether you realized it or not.

[NOTE: In theory, this could end up being a performance problem as well as a space problem if you have many millions files in the AUDIT_FILE_DEST. In Oracle 11G, when allocating an AUDIT file, it would check to see if a file for that SPID already existed and add an incremental number, The act of calling vx_dirbread and vx_dirscan can get terribly slow if there are lots of files to wade through. This mechanism is different in 12C but I have not traced the kernel calls to see if the vx_ calls have gone away.]

To cleanup the O/S, we use the same commands as we did in Part 2, but changed to use an AUDIT_TRAIL_TYPE => DBMS_AUDIT_MGMT.AUDIT_TRAIL_OS

You also need to be aware of the RAC_INSTANCE_NUMBER. For database-level audit, it’s irrelevant. For single instance, it is also irrelevant, but with RAC, this parameter tells you which node to clean up as they are all individual installs with their own audit files (assuming a Grid Infrastructure install per node.)

Commands:

-- Show the BEFORE sizes
host ssh server01 du -m -s /u01/app/oracle/admin/ORCL/a*
host ssh server02 du -m -s /u01/app/oracle/admin/ORCL/a*
begin

-- Loop around every instance in the RAC cluster and run the command
for i in (select inst_id from gv$instance)
loop

-- cleanup AUDIT_FILE_DEST (operating system audit files)
-- set cleanup for 60 days
DBMS_AUDIT_MGMT.SET_LAST_ARCHIVE_TIMESTAMP(
AUDIT_TRAIL_TYPE => DBMS_AUDIT_MGMT.AUDIT_TRAIL_OS,
LAST_ARCHIVE_TIME => systimestamp-60,
RAC_INSTANCE_NUMBER => i.inst_id );

-- And cleanup based upon that date
DBMS_AUDIT_MGMT.clean_audit_trail(
audit_trail_type        => DBMS_AUDIT_MGMT.AUDIT_TRAIL_OS,
use_last_arch_timestamp => TRUE);

end loop;

end;
/

-- And check the sizes AFTER
host ssh server01 du -m -s /u01/app/oracle/admin/ORCL/a*
host ssh server02 du -m -s /u01/app/oracle/admin/ORCL/a*

 


 

Output:

15:58:56 SYS @ ORCL1 > host ssh server01 du -m -s /u01/app/oracle/admin/ORCL/a*
776 /u01/app/oracle/admin/ORCL/adump
15:58:56 SYS @ ORCL1 > host ssh server02 du -m -s /u01/app/oracle/admin/ORCL/a*
694 /u01/app/oracle/admin/ORCL/adump
15:58:56 SYS @ ORCL1 >
15:58:56 SYS @ ORCL1 >
15:58:56 SYS @ ORCL1 > begin
15:58:56   2
15:58:56   3   for i in (select inst_id from gv$instance)
15:58:56   4   loop
15:58:56   5
15:58:56   6      -- cleanup AUDIT_FILE_DEST (operating system audit files)
15:58:56   7      -- set cleanup for 60 days
15:58:56   8      DBMS_AUDIT_MGMT.SET_LAST_ARCHIVE_TIMESTAMP(
15:58:56   9      AUDIT_TRAIL_TYPE => DBMS_AUDIT_MGMT.AUDIT_TRAIL_OS,
15:58:56  10      LAST_ARCHIVE_TIME => systimestamp-50,
15:58:56  11      RAC_INSTANCE_NUMBER => i.inst_id );
15:58:56  12
15:58:56  13      -- And cleanup based upon that date
15:58:56  14      DBMS_AUDIT_MGMT.clean_audit_trail(
15:58:56  15        audit_trail_type        => DBMS_AUDIT_MGMT.AUDIT_TRAIL_OS,
15:58:56  16        use_last_arch_timestamp => TRUE);
15:58:56  17
15:58:56  18   end loop;
15:58:56  19
15:58:56  20  end;
15:58:56  21  /
PL/SQL procedure successfully completed.

15:58:56 SYS @ ORCL1 >
15:58:56 SYS @ ORCL1 >
15:58:56 SYS @ ORCL1 > host ssh server01 du -m -s /u01/app/oracle/admin/ORCL/a*
354 /u01/app/oracle/admin/ORCL/adump
15:58:57 SYS @ ORCL1 > host ssh server02 du -m -s /u01/app/oracle/admin/ORCL/a*
364 /u01/app/oracle/admin/ORCL/adump
15:58:57 SYS @ ORCL1 >
 


 

 
And just one final bit. If you change the default

AUDIT_TRAIL=DB to
AUDIT_TRAIL=DB, EXTENDED

the audit writes the first 2,000 characters of the SQL command to DBA_AUDIT_TRAIL.SQL_TEXT, so you get the full text of the SQL command audited, rather than just the action. Very handy!

UKOUG RDBMS and RAC-CIA Special Interest Groups

On Thursday 21st April, there is a dual UKOUG Database and RAC, Cloud, Infrastructure and Availability special interest group.

For the first time, this event is being held in the fabulous Northern city of Manchester!

There are a dozen interesting, career-assisting, educational talks from end users, Oracle employees and a number of well known Oracle ACE’s at all levels, including Carl Dudley, Jonathan Lewis, Phil Brown and myself.

I will be talking about how to troubleshoot Goldengate, showing optimal configurations to assist with problem determination and a bit of staring at Hex dumps for the brave.

There are only a few places left for this popular dual-stream event. Click Here for more details about the talks and speakers, and details for registration.

See you there!

Sequences – a quick guide and an unexpected quirk

Sequences on Oracle databases are simple, but there are some basic truths about sequences which need to be understood if you are going to use them, especially on RAC:

  1. Sequences will have “holes” in them, even if you specify NOCACHE (e.g. if you rollback your transaction).
  2. Don’t specify NOCACHE. It doesn’t scale.
  3. Sequences will NOT necessarily be, erm, sequential for your transactions. That will depend upon the COMMIT order and if you are using RAC.
  4. The higher the CACHE value, the bigger the holes but the more scalable the sequence will be become. You will lose the entire set of unused cache values across instance restart.
  5. Oracle stores the CACHE value on EACH RAC node, so for a cache of 100,000, NODE 1 gets 1-100,000, NODE 2 get 100,001-200,000 to play with, and so on.
  6. If you are using RAC, and you have a heavy workload on one node and a low number of transactions on another node, there may be a significant disparity between the NEXTVAL on each node. I have seen this disparity manifest in the tens of millions.
  7. There is a global cache lock which must be passed from node to node when getting a new cache of sequence numbers on that node. It is lightweight, but it is a serialisation point. If your CACHE value is too low for your throughput, you will get contention on this lock.
  8. Same as the serialisation point if you specify ORDERED, so you get the sequences advancing in step (rather than each node having its own cache) on multiple RAC nodes. Don’t do this either.
  9. The point if sequences is to give you a unique value for a (surrogate) Primary Key. You can rely upon sequences to be unique. Well. Mostly. As long as they do not CYCLE.
  10. If you use sequences for a PK, you will probably end up with a right-hand index with hot-block contention during inserts, causing you to implement something suboptimal, like hash partitioning or reverse-key indexes to get around your design flaw.

Sequences on RAC can behave in unexpected ways. The below example is a little contrived but shows a potentially unexpected behaviour caused by CYCLEing your sequences:

NODE 1 NODE 2
create sequence seq_1
start     with 1
increment by   1
maxvalue       10
cache          2
cycle;

Sequence created.

NODE1 :select seq_1.nextval from dual;
NEXTVAL: 1
NODE2 :select seq_1.nextval from dual;
NEXTVAL
3 <- Gets next value above cache on node 1
NODE1 : /
NEXTVAL: 2
NODE1 : /
NEXTVAL: 5 (skipping 3 and 4 on Node 2)
6
7
8
9
10
1
2
3  <- erm. Hang on. Node 2 has this!
4  <- and this!!!
5
6
7
8
9
10
1
NODE2 : /
NEXTVAL
4 <- Reading from the local cache
NODE2 : /
NEXTVAL
3 <- And get a new cache value

So, if you are using sequences and you can’t guarantee sequence and you can’t guarantee no gaps and, as this demo shows, you can’t guarantee uniqueness, and they cause hot block right hand index problems, tell me again why you are using them?

Because they are fast and you don’t CYCLE them.

However, sequences are becoming the new evil in the database, right behind triggers.

Extending an ACFS filesystem dynamically.

To extend an ACFS cluster filesystem dynamically, we need to use the acfsutil command:

node01:/u01/grid>/sbin/acfsutil size +10G /u02
acfsutil size: ACFS-03008: The volume could not be resized.  The volume expansion limit has been reached.
acfsutil size: ACFS-03216: The ADVM compatibility attribute for the diskgroup was below the required
                           version (11.2.0.4.0) for unlimited volume expansions.

Oh dear, not 11.0.2.4, so you can only extend volumes dynamically a few times (5) before the global bitmap becomes full. So, now it’s an offline change. 😦

Check what is accessing /u02 and stop it:

node01:/opt/oracle>sudo -s
[root@node01 oracle]# lsof /u02

COMMAND   PID   USER   FD   TYPE    DEVICE SIZE/OFF NODE NAME
bash     5566 oracle  cwd    DIR 252,50177    12288   78 /u02/goldengate/bin11
su      29509   root  cwd    DIR 252,50177    12288   78 /u02/goldengate/bin11

erm. kill -9 5566 29509 :-)

DO THIS ON EVERY RAC NODE!

[root@node01 oracle]# umount –t acfs /u02
[root@node02 oracle]# umount –t acfs /u02
[root@node-n oracle]# umount –t acfs /u02

Once unmounted, we can “repair” the global bitmap:

[root@node02 oracle]# fsck -y -t acfs  /dev/asm/acfsdisk_u02-98
fsck from util-linux-ng 2.17.2
version                   = 11.2.0.4.0
*****************************
********** Pass: 1 **********
*****************************
Oracle ASM Cluster File System (ACFS) On-Disk Structure Version: 39.0
 ACFS file system created at: Thu Jan  2 17:08:02 2014
 checking primary file system
 Files checked in primary file system: 25%
 Files checked in primary file system: 100%

 fsck.acfs: ACFS-07728: The Global_BitMap file has reached the maximum number of extents (5).
 The file system can no longer be expanded. 

 Running fsck.acfs in fixer mode will attempt to consolidate the storage bitmap into 
 fewer extents which would allow for file system expansion

 Checking if any files are orphaned...
 0 orphans found
 Checker completed with no errors.

So lets fix it – output seriously trimmed, but with the important bit

[root@node02 oracle]# /sbin/fsck.acfs -a -v /dev/asm/acfsdisk_u02-98
fsck from util-linux-ng 2.17.2
version                   = 11.2.0.4.0
 *****************************
 ********** Pass: 1 **********
 *****************************
 Oracle ASM Cluster File System (ACFS) On-Disk Structure Version: 39.0
 ACFS file system created at: Thu Jan  2 17:08:02 2014

 checking primary file system
 Files checked in primary file system: 25%
 Files checked in primary file system: 100%

 fsck.acfs: ACFS-07729: The Global_Bitmap file has been
 consolidated into 2 extents.
 This may allow for file system expansion via the 'acfsutil size' command.
  
 Checking if any files are orphaned...
 0 orphans found
 Checker completed with no errors.

So, we’re done and can re-mount ON EVERY NODE. Given it’s now 2 extents, and the max we can have is 5, we have 3 more dynamic extensions before we need to do this again.

mount –t acfs /dev/asm/acfsdisk_u02-98 /u02

And re-attempt to expand the filesystem

node01:/u01/grid>df –h /u02
Filesystem              Size  Used Avail Use% Mounted on
/dev/asm/acfsdisk_u02-98  325G   36G  290G  12% /u02

node01:/u01/grid>/sbin/acfsutil size +10G /u02
acfsutil size: new file system size: 359703511040 (343040MB)
node01:/u01/grid>df -h

node01:/u01/grid>df -h /u02
Filesystem              Size  Used Avail Use% Mounted on
/dev/asm/acfsdisk_u02-98  335G   36G  300G  11% /u02
node01:/u01/grid>

Yey! Bigger filesystem! Lets minimise the amount of times it needs to be extended in the future but doing it in big lumps. Might just save a planned outage.

 

Alternatively, upgrade Grid Infra to at least 11.2.0.4 and set advm compatibility to 11.2.0.4 and the restriction will be gone for good:

ALTER DISKGROUP acfsdisk SET ATTRIBUTE 'compatible.asm' = '11.2.0.4', 'compatible.rdbms' = '11.2.0.4', 'compatible.advm' = '11.2.0.4';
(or ASMCMD [+] > setattr -G acfsdisk compatible.advm 11.2.0.4)
(or right-click on the disk group in asmca and click "edit attributes")

RACCheck

Running RAC? (Why? No, really, WHY?  Never heard of DataGuard? With a broker?)

Running RAC?
Not sure if you’ve configured it correctly?
Not sure if you have all of the recommended initialisation parameters set?
All recommended RPM’s installed?
All daemons running?
etc, etc, etc,

Well, as of Oracle 11.2.0.4 where’s a new feature provided by default called RACCheck. You can find it installed in directory $ORACLE_HOME/suptools/raccheck, (or you can download it from MOS article 1268927.1) and it’s called “raccheck”. With a little sudo configuration, or the root passwords, you can check the configuration on every node in a few minutes per node (run at a sensible time). All the basics appear to be covered, and you get a nice list of anomalies out of the system in HTML format.

I don’t necessarily agree with some of the errors/warnings produced (you might want the “problems” it’s finding!), but it gives you cause to re-think about an element of the system that may be configured in a non-standard way, and you get lots of relevant and useful links to MOS articles.

e.g. One problem: 

WARNING SQL Check Some user sessions lack proper failover mode (BASIC) and method (SELECT) All Databases

Can be happily ignored as I’m using a SCAN listener, which renders this WARNING irrelevant.

but I would recommend that you use the utility and accept/understand any exceptions. It should help stabilise any RAC installations you may have.

%d bloggers like this: