SCAN listener logs – where are they?

These are a bit of a pain as their location isn’t intuitive. So to remind me where they are:

Log in to the server(s) as the grid owner and check the SCAN listener status. This will show you the location of the listener log. cd to just below the diag directory and you’re off:

server-name:/u01/grid>ps -ef | grep SCAN
 grid 8542 8282 0 10:20 pts/0 00:00:00 grep SCAN
 grid 9349 1 0 Mar07 ? 00:07:33 /u01/app/11g/grid/bin/tnslsnr LISTENER_SCAN1 -inherit

server-name:/u01/grid>lsnrctl status LISTENER_SCAN1
LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 28-MAY-2014 10:20:12
Copyright (c) 1991, 2013, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1)))
 STATUS of the LISTENER
 ------------------------
 Alias LISTENER_SCAN1
 Version TNSLSNR for Linux: Version 11.2.0.4.0 - Production
 Start Date 07-MAR-2014 17:27:50
 Uptime 81 days 15 hr. 52 min. 21 sec
 Trace Level off
 Security ON: Local OS Authentication
 SNMP OFF
 Listener Parameter File /u01/app/11g/grid/network/admin/listener.ora
 Listener Log File /u01/app/11g/grid/log/diag/tnslsnr/server-name/listener_scan1/alert/log.xml
 Listening Endpoints Summary...
 (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN1)))
 (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.6.148.141)(PORT=1521)))
 Services Summary...
 Service "FRONT_APP_DB_SVC.WORLD" has 2 instance(s).
 Instance "ORCL1", status READY, has 1 handler(s) for this service...
 Instance "ORCL2", status READY, has 1 handler(s) for this service...
 Service "ORCL.WORLD" has 2 instance(s).
 Instance "ORCL1", status READY, has 1 handler(s) for this service...
 Instance "ORCL2", status READY, has 1 handler(s) for this service...
 Service "ORCLXDB.WORLD" has 2 instance(s).
 Instance "ORCL1", status READY, has 1 handler(s) for this service...
 Instance "ORCL2", status READY, has 1 handler(s) for this service...
 The command completed successfully

server-name:/u01/app/11g/grid/log>cd /u01/app/11g/grid/log
server-name:/u01/app/11g/grid/log>adrci
ADRCI: Release 11.2.0.4.0 - Production on Wed May 28 10:22:47 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
ADR base = "/u01/app/11g/grid/log"
 adrci> show homes
 ADR Homes:
 diag/asmcmd/user_grid/server-name
 diag/tnslsnr/server-name/listener_scan3
 diag/tnslsnr/server-name/listener_scan2
 diag/tnslsnr/server-name/listener_scan1
 diag/asmtool/user_root/host_3797755080_80
adrci> show alert
Choose the alert log from the following homes to view:
1: diag/asmcmd/user_grid/server-name
 2: diag/tnslsnr/server-name/listener_scan3
 3: diag/tnslsnr/server-name/listener_scan2
 4: diag/tnslsnr/server-name/listener_scan1
 5: diag/asmtool/user_root/host_3797755080_80
 Q: to quit
Please select option: 4

And there we are…
Remember to be on the correct host for each SCAN listener. Each one only runs on one node at a time,
so on any other node the alert (listener) log file will be out of date.
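
If you end up doing this lookup often, it can be scripted. What follows is only a minimal Python sketch, assuming you have captured the text of lsnrctl status somewhere; the function name listener_log_location is made up for illustration, and the sample text is copied from the listener output shown earlier. It pulls out the "Listener Log File" line and works out the ADR home (the bit adrci lists under "show homes").

```python
import re

def listener_log_location(status_output):
    """Return (log_file, adr_home) parsed from captured `lsnrctl status` text.

    adr_home is the path from 'diag/' down to the listener directory.
    Returns None if there is no 'Listener Log File' line.
    """
    match = re.search(r"^\s*Listener Log File\s+(\S+)", status_output, re.MULTILINE)
    if match is None:
        return None
    log_file = match.group(1)
    # Keep everything from 'diag/' up to (not including) '/alert/'.
    home = re.search(r"(diag/.+?)/alert/", log_file)
    adr_home = home.group(1) if home else None
    return log_file, adr_home

# Sample lines copied from the lsnrctl output in this post.
sample = """
 Listener Parameter File /u01/app/11g/grid/network/admin/listener.ora
 Listener Log File /u01/app/11g/grid/log/diag/tnslsnr/server-name/listener_scan1/alert/log.xml
 Listening Endpoints Summary...
"""
print(listener_log_location(sample))
```

The second element of the result is the home you would pick in adrci, here diag/tnslsnr/server-name/listener_scan1.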

 

The trouble with Timezones & Grid Infrastructure

When installing Oracle Grid Infrastructure 11.2 (and all other releases), you need to make sure that all of the server settings are correct and to standard before you do the install. One that bit me recently was the timezone setting. The Red Hat 6.4 server(s) in question had the correct file in /etc/localtime (copied from /usr/share/zoneinfo/whatever). If I typed date, I got the reply in the correct timezone (GMT/BST as I’m in London), so all seemed correct.

However, the slack Unix Sysadmin (which might or might not have been me) had not put the correct setting in /etc/sysconfig/clock. Unfortunately, when you install Grid Infrastructure, the setting is read from /etc/sysconfig/clock and embedded into a Grid Infrastructure config file: $GRID_HOME/crs/install/s_crsconfig_hostname_env.txt


### This file can be used to modify the NLS_LANG environment variable, which determines the charset to be used for messages.

### For example, a new charset can be configured by setting NLS_LANG=JAPANESE_JAPAN.UTF8

### Do not modify this file except to change NLS_LANG, or under the direction of Oracle Support Services

TZ=Europe/London

NLS_LANG=AMERICAN_AMERICA.AL32UTF8

TNS_ADMIN=

ORACLE_BASE=


If you change this entry (and you should check with Oracle Support that this is OK for your site), you will need to restart Grid Infrastructure. The one thing about this that I really don’t like is that Oracle is storing a runtime configuration file in an install directory. Does it do that anywhere else?
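
A quick sanity check before (or after) the install is to compare both files against the zone you expect. This is just a sketch in Python, working on the text of the two files rather than opening them. It assumes the Red Hat 6 style ZONE="..." line in /etc/sysconfig/clock; the helper names and the "America/New_York" wrong value are made up for illustration.

```python
def read_setting(text, key):
    """Return the value of KEY=value from config text, ignoring comments and quotes."""
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            return value.strip().strip('"') or None
    return None

def timezone_problems(expected_zone, clock_text, env_text=None):
    """List (file, value found) for every file that disagrees with expected_zone."""
    problems = []
    os_zone = read_setting(clock_text, "ZONE")
    if os_zone != expected_zone:
        problems.append(("/etc/sysconfig/clock", os_zone))
    if env_text is not None:
        grid_tz = read_setting(env_text, "TZ")
        if grid_tz != expected_zone:
            problems.append(("s_crsconfig_<hostname>_env.txt", grid_tz))
    return problems

# Hypothetical contents: the clock file was left wrong and Grid copied it.
clock_bad = 'ZONE="America/New_York"\nUTC=true\n'
env_bad = "### Do not modify this file\nTZ=America/New_York\nNLS_LANG=AMERICAN_AMERICA.AL32UTF8\nTNS_ADMIN=\n"
print(timezone_problems("Europe/London", clock_bad, env_bad))
```

An empty list means both files agree with the zone you asked for. Note that the date command would not have caught this, as it goes by /etc/localtime.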

Online index problem – ORA-08104

So, you’re creating (or rebuilding) an index ONLINE on a busy system. If your session dies, or it becomes necessary to kill the command, you may find that Oracle does not (always manage to) automatically clean up after itself.

CREATE INDEX my_ind ON my_table (mycol ASC)
 LOCAL LOGGING COMPRESS 1 ONLINE;

(ctrl-c)
ORA-01013: user requested cancel of current operation

select * from user_indexes where index_name = 'MY_IND';

INDEX_NAME INDEX_TYPE
MY_IND     NORMAL

OMG! WTF! TLAs! The index is there, even though I cancelled the create statement! Let’s drop it…

 
drop index my_ind;
 *
 ERROR at line 1:
 ORA-08104: this index object 79722 is being online built or rebuilt

So, HOW do I sort out this mess? Use DBMS_REPAIR!

  
declare
  lv_ret BOOLEAN;
begin
  -- 79722 is the object id reported in the ORA-08104 message
  lv_ret := dbms_repair.online_index_clean(79722);
end;
/

select * from user_indexes where index_name = 'MY_IND';

no rows selected

Bang! and the index (or, rather, left-over temporary extents from the build attempt) is gone, ready for you to try again.
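
The only fiddly bit is fishing the object id out of the error message. As a rough sketch (Python, with a made-up function name), this takes the ORA-08104 text and builds the same DBMS_REPAIR block for whatever object id Oracle reported. It only generates the text; you still run it yourself in SQL*Plus.

```python
import re

def online_index_clean_block(error_text):
    """Build the PL/SQL cleanup block for the object id named in an ORA-08104 message.

    Returns None when the text contains no ORA-08104 error.
    """
    match = re.search(
        r"ORA-08104: this index object (\d+) is being online built or rebuilt",
        error_text,
    )
    if match is None:
        return None
    object_id = int(match.group(1))
    return (
        "declare\n"
        "  lv_ret BOOLEAN;\n"
        "begin\n"
        f"  lv_ret := dbms_repair.online_index_clean({object_id});\n"
        "end;\n"
        "/"
    )

message = " ORA-08104: this index object 79722 is being online built or rebuilt"
print(online_index_clean_block(message))
```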

Goldengate OGG-01223 (Version) Problem

Just implementing Goldengate between a platform I don’t understand, a Tandem/HP NonStop, and Oracle 11g R2 RAC. So, I spend the day trying to get it working, have all of the configuration seemingly correct, and when the Tandem guy and I try to form a connection to send data over, we keep getting the following warning (and no data):

WARNING OGG-01223  Oracle GoldenGate Collector for Oracle:  Could not find definition for <Tandem-table-thing>

So, after extensive tweaks, mods, changes, banging heads off desks and general disbelief, I spend a while Googling (or, more accurately, duckduckgo-ing – I’m sick of being the commodity) and searching MOS, and notice a problem between Version 11.2 of GG and 11.1. It was in no way related to the problem we had, but it prompted a thought: exactly which versions of GG are on each platform? There was a minor discrepancy of versions.

OK – download GG 11.1.1.0.0 for Linux to match the Tandem version. Install. Replay the Configuration. Run. Works. Simples.

Lesson: Check your versions, check your compatibility. Where possible, keep them absolutely perfectly aligned and you might not waste half a day troubleshooting. Grrrr.

Oracle Data Files

I was looking at the contents of a tablespace recently – just the extents, not a block dump – with the intention of shrinking a couple of the associated datafiles (don’t ask why – the reason is both stupid and irrelevant). I needed to know just how much I could shrink immediately, and if there were any quick wins in terms of moving objects from near the end of files so even more space could be reclaimed. I was a little surprised to discover that something which I have been doing for over 20 years was not as common knowledge as I thought it was, so I thought I would try to let some more people know how to do this.

So, what’s in a tablespace? It’s made up of a number of datafiles (whether in ASM, RAW or on a journalled file-system is irrelevant for this conversation). Within the datafiles are extents. These are logical groupings of blocks – a multiple of the block size – which show where your objects are stored within the datafile.

Back in 1992 I wrote this piece of SQL to show where the extents live. It still works today, regardless of whether your tablespaces are dictionary or locally managed:

 
SELECT tablespace_name, file_id, owner, segment_name, block_id begin,
blocks, block_id+blocks-1 end, bytes/1024 KB, '' free
FROM sys.dba_extents
where tablespace_name = 'USERS'
UNION
SELECT tablespace_name, file_id, '' owner, '' segment_name, block_id begin,
blocks, block_id+blocks-1 end, bytes/1024 KB, 'FREE' free 
FROM sys.dba_free_space 
where tablespace_name = 'USERS' 
ORDER BY 1,2,5

TSPC ID OWNER  SEGMENT_NAME    BEGIN  BLOCKS   END     KB  FREE
USERS 4 SCOTT  DEPT              128       8   135     64
USERS 4 SCOTT  PK_DEPT           136       8   143     64
USERS 4 SCOTT  EMP               144       8   151     64
USERS 4 SCOTT  PK_EMP            152       8   159     64
USERS 4 SCOTT  SALGRADE          160       8   167     64
USERS 4                          168     128   295   1024 FREE
USERS 4 SYS    TABLE_1           296       8   303     64
USERS 4 SYS    TABLE_1           304       8   311     64
USERS 4 SYS    TABLE_1           312       8   319     64
USERS 4 SYS    TABLE_1           320       8   327     64
USERS 4 SYS    TABLE_1           328       8   335     64
USERS 4 SYS    IND_1             336       8   343     64
USERS 4                          344      40   383    320 FREE
USERS 4                          384     128   511   1024 FREE
USERS 4                          512     128   639   1024 FREE
USERS 6 SYS    TABLE_2           128     128   255   1024
USERS 6                          256   12544 12799 100352 FREE
USERS 6 SYS    IND_2           12800     256 13055   2048

So, as we can see from the output above, the tablespace USERS has 2 datafiles; “4” and “6” (you can identify them fully using view DBA_DATA_FILES). We can shrink file 4 immediately by 1024K+1024K+320K = 2368K, a bit over 2M.

File 6 is a bit more of a problem. There’s an index IND_2 in the way, unhelpfully located at the end of the datafile, which stops us from shrinking the datafile. If you try to resize the file using ALTER DATABASE DATAFILE ‘+DATA/orcl/…..’ RESIZE 10M, it will fail with the error ORA-03297: file contains used data beyond requested RESIZE value. In this case, the remedy may be as simple as performing an ALTER INDEX IND_2 REBUILD TABLESPACE USERS ONLINE; This will (probably) relocate the index somewhere lower in the file and allow us to shrink the datafile. Moving tables with an ALTER TABLE … MOVE command may prove more difficult depending upon your throughput due to the extra locks needed at the start and end of the transaction, and extra care needs to be taken with any LOB objects involved.
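
The "how much can I shrink right now" sum is easy to get wrong by eye on a big extent map, so here is the same arithmetic as a small Python sketch. It is only an illustration: the rows are the file, begin block, KB and free flag copied by hand from the USERS output earlier in this post, and the function name reclaimable_kb is made up. It walks each file backwards from the end and adds up free space until it meets a used extent.

```python
from collections import defaultdict

def reclaimable_kb(extent_rows):
    """Return {file_id: KB of free space sitting after the last used extent}."""
    by_file = defaultdict(list)
    for file_id, begin, kb, is_free in extent_rows:
        by_file[file_id].append((begin, kb, is_free))
    result = {}
    for file_id, extents in by_file.items():
        trailing = 0
        # Walk backwards from the end of the file until a used extent is hit.
        for begin, kb, is_free in sorted(extents, reverse=True):
            if not is_free:
                break
            trailing += kb
        result[file_id] = trailing
    return result

# (file_id, begin_block, kb, is_free) copied from the extent map above.
rows = [
    (4, 128, 64, False), (4, 136, 64, False), (4, 144, 64, False),
    (4, 152, 64, False), (4, 160, 64, False), (4, 168, 1024, True),
    (4, 296, 64, False), (4, 304, 64, False), (4, 312, 64, False),
    (4, 320, 64, False), (4, 328, 64, False), (4, 336, 64, False),
    (4, 344, 320, True), (4, 384, 1024, True), (4, 512, 1024, True),
    (6, 128, 1024, False), (6, 256, 100352, True), (6, 12800, 2048, False),
]
print(reclaimable_kb(rows))  # {4: 2368, 6: 0}
```

File 4 comes out at 2368K. File 6 comes out at zero, even though it has 100352K free in the middle, because IND_2 sits at the very end of the file.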

Problem solving with DBMS_DATAPUMP

Yesterday I had a problem… I noticed that one of our overnight jobs was failing. The job in question is an ad-hoc tidy-up job, run as needed when a performance metric for a specific query indicates that we’re going to be in trouble soon.

So, what was the problem? I have the stack trace captured in the job output:

Finished shrinks at 02-APR-2012 00:12
 Creating job data pump export job LOG_EXPORT
 User:....... NEW_SCHEMA
 Tables:..... 'table-1','table-2'
 DECLARE
 *
 ERROR at line 1:
 ORA-31626: job does not exist
 ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
 ORA-06512: at "SYS.DBMS_DATAPUMP", line 911
 ORA-06512: at "SYS.DBMS_DATAPUMP", line 4356
 ORA-06512: at line 22



So, “job does not exist”! What does that mean, and what could be causing this to happen? What has changed? Let’s look at the code at line 22 and see what we’re running:

.
.
v_handle := DBMS_DATAPUMP.OPEN ( operation => 'EXPORT', 
                                 job_mode  => 'SCHEMA', 
                                 job_name  => 'LOG_EXPORT');
.
.

And that’s it. We’re initiating a DBMS_DATAPUMP job (to extract the data to an external file before manipulation.)

So, what has changed? Since the last invocation of this job, we have had a new software release, and for this product that means a whole new schema. We’ve missed something. We must have a permissions problem within the database. Fortunately, we keep the old schema in place, locked, following an upgrade so all we need to do is compare the permissions for the old and new schemas and we’re done.

select granted_role from dba_role_privs where grantee = 'OLD_SCHEMA'
minus
select granted_role from dba_role_privs where grantee = 'NEW_SCHEMA'
/

no rows selected

select privilege from dba_sys_privs where grantee = 'OLD_SCHEMA'
minus
select privilege from dba_sys_privs where grantee = 'NEW_SCHEMA'
/

no rows selected

select table_name,privilege from dba_tab_privs where grantee = 'OLD_SCHEMA'
minus
select table_name,privilege from dba_tab_privs where grantee = 'NEW_SCHEMA'
/

TABLE_NAME                     PRIVILEGE
------------------------------ ----------------------------------------
DATA_PUMP_DIR                  READ
DATA_PUMP_DIR                  WRITE



Ah ha! There’s the difference! And it is definitely data pump related, so let’s run the grants into the database and run the job again. This will fix it.

grant READ  on DIRECTORY data_pump_dir to NEW_SCHEMA;
grant WRITE on DIRECTORY data_pump_dir to NEW_SCHEMA;

... and run the job again...

 DECLARE
 *
 ERROR at line 1:
 ORA-31626: job does not exist
 ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
 ORA-06512: at "SYS.DBMS_DATAPUMP", line 911
 ORA-06512: at "SYS.DBMS_DATAPUMP", line 4356
 ORA-06512: at line 22

What! Oh nuts. Why didn’t that work? [because you have not assigned a directory to DBMS_DATAPUMP yet! It’ll be useful later though… Now stop jumping to conclusions! ed.] What are we going to do now? We know the permissions match. I’ve checked the O/S and there’s space in the filesystem. DBMS_DATAPUMP is not giving me enough information; what can I do to diagnose this problem further? I know! Why don’t I use datapump externally and see if that makes a difference…

expdp NEW_SCHEMA/..pwd..@..DB.. 
 directory=data_pump_dir 
 dumpfile=neilchandler.dmp 
 job_name=neilchandler

Export: Release 10.2.0.3.0 - 64bit Production on Tuesday, 03 April, 2012 12:54:27
Copyright (c) 2003, 2005, Oracle.  All rights reserved.
Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP and Data Mining Scoring Engine options
ORA-31626: job does not exist
ORA-31633: unable to create master table "NEW_SCHEMA.NEILCHANDLER"
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 95
ORA-06512: at "SYS.KUPV$FT", line 863
ORA-01536: space quota exceeded for tablespace 'USERS'



Oh heck, we can’t write the master table as we have been denied a space quota on the default tablespace. D’oh! How painful was that to locate?

alter user NEW_SCHEMA quota unlimited on USERS;



And it all worked.

My question is, why couldn’t DBMS_DATAPUMP tell me that?
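
To show the difference between the two error stacks, here is a rough Python sketch. It uses a rule of thumb, not anything Oracle documents as a guarantee: skip the ORA-06512 backtrace lines and take the last remaining ORA- error as the probable root cause. The function name is made up, and the two stacks are copied from the output earlier in this post.

```python
import re

def probable_root_cause(error_stack):
    """Return (code, text) of the last ORA- error that is not an ORA-06512 line."""
    errors = re.findall(r"^\s*(ORA-\d{5}): (.*)$", error_stack, re.MULTILINE)
    meaningful = [(code, text.strip()) for code, text in errors if code != "ORA-06512"]
    return meaningful[-1] if meaningful else None

# Stack returned through the DBMS_DATAPUMP call in the job.
plsql_stack = """
 ORA-31626: job does not exist
 ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
 ORA-06512: at "SYS.DBMS_DATAPUMP", line 911
 ORA-06512: at "SYS.DBMS_DATAPUMP", line 4356
 ORA-06512: at line 22
"""

# Stack returned by the expdp command line client.
expdp_stack = """
ORA-31626: job does not exist
ORA-31633: unable to create master table "NEW_SCHEMA.NEILCHANDLER"
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 95
ORA-06512: at "SYS.KUPV$FT", line 863
ORA-01536: space quota exceeded for tablespace 'USERS'
"""

print(probable_root_cause(plsql_stack))
print(probable_root_cause(expdp_stack))
```

From the PL/SQL stack there is nothing to find beyond ORA-31626, while the expdp stack gives up ORA-01536 straight away. So when the package is being unhelpful, running the equivalent expdp as the same user is a cheap next step.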

SAN Migration: When modern SANs “Fail”

… or at least, when they don’t behave like you were expecting them to.

I recently performed a major migration of a dozen mission critical Oracle and SQL Server systems from a pair of old EMC CX700’s to 2 brand new shiny HP XP 20000’s. This blog post is intended to show, at a very high level, what steps were taken to ensure the new SAN could cope with the volume of traffic. It also shows that, despite the new SAN being much more modern and intelligent, you can still get caught out by unexpected behaviour.

The databases to be migrated varied from high throughput OLTP, low throughput – fast response time OLTP, a small Data Warehouse and Grid Control. The systems are 24 x 7 x 365, but a nightly maintenance window can be arranged on some days of the week between midnight and 3am. Not a large window, but invaluable.

What are we migrating?

For a successful SAN migration, we first need to understand what we are migrating. To do this we need to profile the I/O requirement of each Oracle database. Oracle provides us with all of the tools we need to be able to do this. If you have the Tuning and Diagnostics Pack licensed (which all Oracle sites should have; it should really be part of the base product) then you can pull the information you need directly out of the AWR table DBA_HIST_SYSTEM_EVENT. By default your AWR stats gather every hour and only last a week.

Hourly averages of system event wait times will give a good starting point, but I would not want to use them to determine peak volumes. If you capacity plan to your averages, you will run out of resources before you hit your peaks. A lot can happen in an hour when you are looking at millisecond response times. You need to be much more granular than an hourly average, but you might not want to change AWR to gather all of its information as frequently as you need for the I/O events. You need to develop some code to gather the information from v$system_event directly, so I did [I will post this code up shortly]. You can be as granular as you want, but I balanced peaks against volume of information and the impact of gathering it and was running every 5 minutes. Mostly. I also took some much more frequent samples to ensure that I wasn’t wildly wrong with the 5 minute averages, and that I wasn’t experiencing very short peaky loads. This method also has the advantage of not needing any additional pack licensing from Oracle. Sorry Larry.
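
To be clear, this is not the collection code I used. It is a small Python sketch of the sums involved, assuming you have sampled TOTAL_WAITS and TIME_WAITED_MICRO from v$system_event for one event at regular intervals. Those counters are cumulative since instance start-up, so each interval average is the difference between two samples. The numbers and the function name are invented for illustration.

```python
def interval_averages_ms(samples):
    """Turn cumulative samples into per-interval average wait times in ms.

    samples: list of (label, total_waits, time_waited_micro) in time order.
    Intervals with no new waits (or a counter reset) are reported as 0.0.
    """
    averages = []
    for (_, waits0, micro0), (label, waits1, micro1) in zip(samples, samples[1:]):
        waits = waits1 - waits0
        micro = micro1 - micro0
        avg_ms = (micro / waits) / 1000.0 if waits > 0 else 0.0
        averages.append((label, round(avg_ms, 2)))
    return averages

# Invented 5 minute samples for a single I/O wait event.
samples = [
    ("10:00", 1_000_000, 2_000_000_000),
    ("10:05", 1_010_000, 2_020_000_000),
    ("10:10", 1_020_000, 2_100_000_000),
    ("10:15", 1_030_000, 2_120_000_000),
]
print(interval_averages_ms(samples))                    # 5 minute view
print(interval_averages_ms([samples[0], samples[-1]]))  # one 15 minute view
```

With these made-up numbers the 5 minute view shows 2ms, 8ms, 2ms, while the single longer interval reports a comfortable looking 4ms. That is the whole argument against planning from hourly averages.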

What are the resources available on the new SAN?

The second requirement for a SAN migration is to understand the capabilities of your new SAN. There were already some systems on the new SAN so I was unable to test to destruction. However, some dedicated resources were available on the shared SAN, namely dedicated disks, cache and front-end ports. I needed to throw substantial workloads at the SAN to stress the resources and find the tipping point. There are many workload tools available, such as Hammerora and RAT. Some are expensive, such as RAT. However, for my purposes I was only looking to stress the SAN and get a comfortable feeling about the IOPS and throughput performance. This was achieved using Oracle’s free SAN stress tool, Orion [I will hopefully post more details about this shortly]. By taking all of the profiled I/O rates, adding them up and comparing against the new SAN and its much greater cache capacity (16GB dedicated to my systems, up from 4GB on the EMC’s), we were able to see that the footprint of the databases to be migrated fitted comfortably into the capabilities of the new SAN. And so the migrations began.

Thin Provisioning

To provide resilience for the databases, LUNs were presented from each SAN to each cluster node, and were mirrored at the Operating System level, either using native mirroring on the Unix platforms, or Veritas on the Windows platforms. This allows either SAN (or SAN location) to fail with no loss of service. The LUNs were either RAID-10 or RAID-5, and were carved out of the storage pools using Thin Provisioning.

Thin Provisioning (or using Meta-Luns or similar techniques from your SAN provider) is a way to allow LUN’s to be easily carved out of storage pools and extended at a later date. They are wonderful for SAN administrators, and for people who have not capacity planned their systems, because of their dynamic nature and minimization of space wastage. However, they work on the basis of providing a bit of storage to your LUN from every disk in the array. This gives a maximum concurrency, maximum contention model. Every LUN is on the same disks, competing for the same resource. Lots of IOPS, but everyone wants them. The Thin Provisioning also adds a small overhead to the I/O processing. Not much. Not even a whole millisecond.

Migration Problem

The migrations progressed well, from least critical to most critical system. As each system was migrated, we kept monitoring and measuring the I/O response times. Migrating the Data Warehouse showed that the new SAN was performing as expected. Processing times were lower. Not much, but in line with expectation and allocated resources. However, Grid Control was exhibiting some strange response times. Despite theoretically having more resources, some I/O response time metrics were worse on average, and much more variable than before. With the EMC SAN’s, we  had experienced a good, consistent level of performance for each LUN. The XP’s were proving to be more erratic.

I did a lot of investigation, moving resources around (e.g. dedicated front-end ports for the Grid Control LUN’s) and measuring the effect. The level of I/O associated with Grid Control, with only a couple of hundred servers under management, was substantially lower than that of systems competing for shared resources (e.g. the Data Warehouse, which was far more demanding). It seemed that Grid Control’s LUNs were not able to survive in the SAN cache. The old EMC CX700’s were relatively “dumb” compared to the new XP20000’s, and had effectively dedicated an amount of cache to each LUN. The more intelligent XP’s would not be so potentially wasteful of resources, and so deemed the (relatively) low level of activity from Grid Control’s LUN’s to be unimportant and aged the blocks out.

We could live with this for Grid Control. Blisteringly fast response time is not a critical factor in the success of Grid Control. The same could not be said for one of our mission critical applications. It requires a very low end-to-end response time for each transaction. We carefully measure each step – client, calculation, render, network latency, database response time, etc. It needs to be fast and consistent. From the database, it really needs to hit cache or SSD’s. Spinning rust (ordinary disk) is simply too slow. However, the level of activity for this system, in terms of MB/s and IOPS, is relatively low compared to all of the other systems competing for SAN resources. About 3% of the whole. Add to this the overhead of Thin Provisioning and we have a problem. When a 2ms average turns into a nearly 4ms average with much higher peaks, as it did for db_file_sequential_read on Grid Control, that would be a major problem for this application.

Solution

Talking to HP, they would only guarantee the response time we demand by using SSD’s, at a cost that made my eyes spin and think of buying a Supercar instead. A more practical (i.e. cheaper) solution had to be found, and it was (with the help and support of HP, just no guarantees 🙂 ). We bought some more conventional disk and configured it into a traditional RAID-10 storage pool. No Thin Provisioning. We then partitioned the 16GB SAN cache into a 4GB cache dedicated to this system, and left the 12GB cache behind for everything else. We were migrating from 4GB, and understood that we had plenty of wriggle room in this configuration.

And the result? After migration, the performance of the mission critical low throughput system improved substantially, with consistent log_file_parallel_write times of less than 2ms and db_file_sequential_read times of less than 1ms, better than we were experiencing on the old reliable EMC’s. I mean, less than 1ms? That’s fast.

I/O Response Times – Graph
