Running RAC in a Virtual Environment

I was in a discussion at an Oracle Meetup this week, and the subject of RAC in a virtualized environment – specifically Oracle Virtual Machine (OVM) – came up.
Here’s a couple of points which were discussed.

pingtarget

There was a lack of awareness of a common problem, which has a solution built-in to Oracle 12.1.0.2 Grid Infrastructure and later. In a virtualized environment, the network components are also virtualized. Sometimes, network failures on the host may not be raised up to the guests. As a result, the guest O/S commands can fail to detect the network failures and the Virtual NIC remains up. Grid Infrastructure (GI) will not perform a VIP fail-over as it can’t see the failure despite the network being unavailable.

To resolve this, Oracle has added an option of a “pingtarget” for each public network defined in GI. This will perform a keep-alive to a external device, usually something like the default gateway. This is just like the heartbeat on the cluster interconnect.

Before

srvctl config network

Network 1 exists

Subnet IPv4: 192.168.0.160/255.255.255.224/eth1, static
Subnet IPv6:
Ping Targets:
Network is enabled
Network is individually enabled on nodes:
Network is individually disabled on nodes:

The default gateway makes a good ping target. For this IP and subnet, it’s 192.168.0.161

srvctl modify network -k 1 -pingtarget 192.168.0.161

After

srvctl config network

Network 1 exists

Subnet IPv4: 192.168.0.160/255.255.255.224/eth1, static
Subnet IPv6:
Ping Targets: 192.168.0.161
Network is enabled
Network is individually enabled on nodes:
Network is individually disabled on nodes:

All safe!

Server Pools

A second item we discussed was the Server Pools in OVM. Each RAC guest should be on a different host, otherwise you have not eliminated that as a Single Point Of Failure. A second less obvious SPOF is the Server Pool disk.

A Server Pool is a filesystem LUN (and IP address prior to release 3.4) used to group a logical collection of servers with similar CPU models, within which we can create and migrate VM guests. For a RAC installation, each RAC node should be within a different server pool, as well as on different physical hardware.

ovs

In this image, RAC nodes within the same cluster should be created within each server pool. This configuration can safely support a 2 node cluster despite having 4 servers, with one node created in “OVS-Pool-2” on server “ovs02“. The second node should be in “OVS-Pool-1″ and can be on “ovs01“, “ovs11” or “ovs12“.

It is possible to live migrate guests between these 3 servers.

 

 

 

 

 

UKOUG Technology Day

northerntech17-com-600x110-rn-v2

We have organized a UKOUG Technology Summit in Leeds on 16th May.

It’s really a small Conference!

This is available to UKOUG Members and will cost 1 SIG Pass – even Bronze memberships come with 1 SIG pass, so all members could attend at no cost.
Non-members can pay to attend, but you should take out a Bronze Membership instead as it’s about the same price!

Come along, network with your peers, and listen to some amazing talks about all aspects of the Oracle technology product set; the Database, RAC, Engineered Systems, Development, Apex, DevOps, Solaris 11.4!, and more. Learn how Oracle interacts and works with Kubernetes, Docker, Apache Kafka, Apache Impala, Chatbots, and  just about everything else! 30 amazing talks!

Plus a keynote address from Oracle Executive John Abel and a social afterwards too!

All that, just 2 hours from London on the train!

You can find the Agenda here 

northerntech17-agenda

 

 

UKOUG Member Advocate: sorted!

I received an email from the UKOUG last week.

winners-picture

 

It turns out that I have won the vote to become the next Member Advocate for the UKOUG. I will (eventually) be joined on the board by Martin Widlake – who won the vote for the position of President Elect – a position whereby Martin will shadow the current UKOUG President before taking office on the Board. He has written about it here.

I would very much like to thank everyone who voted for me. I hope I can live up to your expectations. If you did not vote for me, that’s OK – I’ll represent you and your interests just the same as everybody else! Looks like I’m going to have to spend time getting up to speed with the Apps community, and re-acquainting myself with my old JDE friends.

Here’s to moving the User Group forward to meet the challenges of the changing face of our industry. See you all soon, and Cheers Everyone!

(sorry – would have posted this sooner but I was at the OUG Ireland User Group giving a couple of presentations and taking part in a Q&A panel about “18c and everything else!” – fantastic conference and lovely people there!)

 

 

UKOUG M.A. Voting Hots Up!

It’s getting hotter!

After many years volunteering at the UKOUG I am standing to be the Member Advocate at the UKOUG, representing the members on the board.

Do you think I will represent the membership well? Yes. VOTE FOR ME!

Oh, you must be the main UKOUG membership contact, or know who is, to vote.

giphy

One twist in the election is that Steve Davis has withdrawn from the vote due to business commitments, leaving a straightforward race between myself and Andy Nellis. That’s a shame as I think Steve would have made a good Member Advocate had he been voted in. Andy is unknown to me so I cannot comment on his suitability as a UKOUG volunteer or board member.

I’d really appreciate your vote – I know that many of you have voted for me already, but I want ALL of you to give me a strong mandate to improve the user group.

VOTE HERE

 

 

KILL! KILL! KILL! (of Unix processes)

The start of this isn’t my post – I got it from here: but I wanted to reblog/repost and enhance it because as far as I can tell, 99% of all known DBA’s only use kill -9 to remove unhappy processes.


Original Post:

Useless Use of Kill -9 form letter

No no no.  Don't use kill -9.

It doesn't give the process a chance to cleanly:

1) shut down socket connections

2) clean up temp files

3) inform its children that it is going away

4) reset its terminal characteristics

and so on and so on and so on.

Generally, send 15, and wait a second or two, and if that doesn't
work, send 2, and if that doesn't work, send 1.  If that doesn't,
REMOVE THE BINARY because the program is badly behaved!**

Don't use kill -9.  Don't bring out the combine harvester just to tidy
up the flower pot.

**don’t remove your Oracle or any other binaries please.


 

I hope you found that useful. I know I did. But what do the numbers mean? Well, they are increasingly violent ways to ask the program to stop itself. The command kill -9 isn’t asking the program to stop, it’s asking the O/S to stop running the program now, regardless of what it’s doing.

Run order of kills:

kill -15 : this is the equivalent of kill -sigterm and it the default. The program should terminate after it has finished what it is doing.

kill -2 : this is the equivalent of kill -sigint and is the same as pressing CTRL+C. This should mean “stop what you’re doing” — and it may or may not kill the program.

kill -1 : this is the equivalent of kill -sighup and tells the program that the user has disconnected. (e.g. SSH session or terminal window was closed). It usually results in a graceful shutdown of the program.

The executing program needs to be coded to recognise these kill signals, and all good software will spot them.

The other fun kill command is kill -sigstop. This can’t be blocked (like -9) as it’s an O/S level command too, but freezes the program execution like pressing CTRL+Z. You can continue the program execution later using kill -sigcont.

UKOUG Northern Technology Summit

northerntech17-com-600x110-cfp-v1

Here at the UKOUG, we’ve just announced that we will be holding a 1-day UKOUG Northern Technology Summit in the North of England on 18th April 2018.

It will be in Leeds, at the Park Plaza!

This is a 5-stream 1 day event, with around 30 talks! We’d call it a conference – it’s bigger than some European national conferences! – but we don’t want to have any confusion with our sexy flagship conference, UKOUG Tech17 (or, indeed Tech18 in Liverpool!)

The Call For Papers is now Open, and we are looking for talks in relation to Database, RAC, Cloud, Apex, Development and Engineered Systems. A full list can be found on the CfP page, linked here.

Tweet about it with the hashtag #UKOUG_ntech

HrOUG – The Croatian Conference

Arriving on Tuesday afternoon, directly from the Slovenia Conference and it’s straight into the 3rd Autumn conference! Conveniently, Slovenia and Croatia are timed together and are held within a little over 1 hours drive (plus a 15 minute boat ride!) apart. Well done on that one guys!IMG_1892

A quiet beer as many speakers and delegates wait at the docks for the boat to arrive, and I seem to be timing these sunsets well!IMG_1795

Wednesday was a day off, so Tuesday night allowed me to let my hair down a little and have a drink until late into the night. Wedneday, and the organizer, Davor, had arranged for all of the Oracle ACEs and Oakies to go on a tour of nearby Pula followed by some Go Karting. Great fun! I knew nothing of Pula so I was rather surprised to discover a huge Amphitheatre there!

This slideshow requires JavaScript.

The talks, on Thursday and Friday, were a repeat of the talks in Slovenia about Histograms and Plan Stability.  A great conference in a truely excellent location – having the conference on an island gives it the same magic as the OUGN conference: you can’t just wander off, we are all there together and that really adds to the conference.

IMG_1933

Rovinj with Andrew!

As a added bonus, one of my friends happened to be in the nearby town of Rovinj on the Friiday afternoon, which meant a serendipidous meet-up for lunch.

Overall a wonderful pair of conferences which felt as much a holiday as work, although I think I need a holiday to recover from them.

SiOUG – The Slovenian Conference

So October came around and it was time for my second and third conferences of the autumn season.  I had been kindly select to do 2 talks at this Slovenian User Group, but a last-minute problem with another speaker meant I found myself giving 3 talks in 2 days. Excellent fun!IMG_1728

 

I arrived in lovely Portoroz just as the sun was setting, with what seemed like most of Slovenia leaving the coast on the final hot Sunday of the year, just in time for a huge and wonderful dinner with all of the other speakers.

Day 1 was a whirlwind of talks, food and presenting  much of what I know about Histograms and Chocolate, washed down with a local craft ale – Thank You HumanFish Brewery! Yum!

A party at the end of day 1, with a generous consumption of wine and Craft Ale – all consumed with the lovely friendly and fun Slovenians at the conference – y’all know who you are! – meant that Day 2 held some trepidation…. would I wake up in time!

Day 2 saw me giving 2 more talks, about Optimizer Plan Stability, and also about DataGuard, in front of an interested and good sized audience before hopping into my car and driving down to Croatia to give 2 talks in a somewhat more leisurely 3 days. Tomorrow is a day off! Holiday!

High Five POUG: Polish Oracle User Group Conference?

I’ve just returned from presenting at the 2nd annual POUG Conference, this time in Krakow, Poland (the first was in Warsaw). This is the Polish Pint with Oracle User Group conference and is exactly what a conference should be. Educational, Informative, Well-executed and Fun.  It was based in a lovely city, with a wonderful friendly people, good food and great beer!

The Speakers gift was very thoughtful - a personalised t-shirt which my wife has declared too good to wear!

As you might imagine, with a name like Pint with Oracle User Group, there is beer to be drank. But in this case it was as available as the tea and coffee – and completely free! Several hundreds of litres of it were consumed over the course of the 2 day conference, a tiny little bit of it by me. I suppose this is the side effect of having a conference in a micro-brewery. Talk about organization skills!

IMG_1606

Brewery and Conference Location!

There were around 30 speakers giving talks, many of them well-known Oracle ACEs of various hues but a couple of first-timers who were excellent too.  It also reminds me that you don’t need to be an expert to speak at conference. You don’t need to know the n-th level of detail like Jonathan Lewis. You just need a story to tell about something you did that was interesting! I assume we’re all doing something interesting in our jobs from time to time? If you are not, you might want to try to find a job with interesting things to do, then talk about them!

One unique thing was being walked onto stage to a thumping good tune (I picked Human, by Rag’n’Bone Man) by an anthropomorphisation of the POUG Logo – basically a IMG_1571young ‘Oracle’ dressed like Obi-Wan Kenobi. They were also there to help out the delegates and speakers with any queries or problems they may have. Usefully they were all current or recently completed students on their way to being a doctors, vetenarians or structural engineers. High quality help indeed!

Many thanks to Kamil, Luiza and everyone else at POUG for a wonderful conference to remember.

IMG_1536I was debuting a new conference talk about Histograms. The talk went well, I think, but there are a couple of tiny kinks to iron out for the next time I present it – in Slovenia and Croatia – next month. Best remove the POUG-specific slides too, right Adam!

 

 

 

 

 

 

 

oh, and the talk of the Conference was…Martin Widlake, with a gigantic beer!

IMG_1610

Martin Widlake giving a rather good talk about the parallels (and perils) of Beer and Development

 

 

Online Index Rebuild Problem in 12C/18C/19C

When building an index in Oracle Enterprise edition using the keyword “ONLINE”, if the index build fails for whatever reason a messy set of extents can get left behind.

In Oracle 12.1 [update still there in 12.2 and 18C and 19.6] there’s a job (located in DBA_SCHEDULER_JOBS) to help with this: “SYS”.”CLEANUP_ONLINE_IND_BUILD” which ticks away, undocumented, in the background cleaning up “rogue” extents once an hour, every hour (start time based upon job creation time)However, if I run a CREATE INDEX … ONLINE command on a table and the index creation happens to coincide with the invocation of the cleanup job I get the following error in my alert log

ADR Home = /u01/app/oracle/diag/rdbms/orcl/orcl1:
*************************************************************************
2017-05-12 05:28:37.191123 -04:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_j000_12235.trc:
ORA-12012: error on auto execute of job "SYS"."CLEANUP_ONLINE_IND_BUILD"
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
ORA-06512: at "SYS.DBMS_PDB", line 76
ORA-06512: at line 4

Oh no! The job has failed to clean up the extents that I was currently using. Phew!

The CREATE INDEX … ONLINE command completed with success a few minutes later.

A few minutes after that, an update statement failed with:

ORA-08102 index key not found for obj# 107102 file 16 block 1171234

Yes, I have a corrupt index! index_fadeMy assumption is that the cleanup job had removed all of the extents which had been allocated before the cleanup job started. The index create statement did not notice as the current extent removal attempt blocked (hence the job error) and future to-be created extents were not affected. However, most of the index had been “cleaned up”. #sadface

Solution: re-create the index and make sure the index creation is complete before your hourly job kicks in. Better still, disable the undocumented hourly job… make sure get the full backing of an Oracle Support SR first!

Next Step: Raise SR with Oracle about this!

Version Check 12/11/2018:

select banner_full from v$version

BANNER_FULL
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Oracle Database 18c EE High Perf Release 18.0.0.0.0 - Production
Version 18.1.0.0.0


21:17:19 SYS @ ORCL > select owner,job_name,start_date,repeat_interval,state from dba_scheduler_jobs where job_name = 'CLEANUP_ONLINE_IND_BUILD';

OWNER JOB_NAME START_DATE REPEAT_INTERVAL STATE
------------------------------ ------------------------------ ----------------------------------- ------------------------------ ---------------
SYS CLEANUP_ONLINE_IND_BUILD 07-FEB-18 08.13.44.415343 PM +00:00 FREQ = HOURLY; INTERVAL = 1 SCHEDULED

 

 

MOS Solution Note 2280374.1 is not helpful:

Solution

Rebuild the index at a time other than the time frame when “SYS”.”CLEANUP_ONLINE_IND_BUILD” is running as auto job. #facepalm #whatifittakesover1hourtobuild