KILL! KILL! KILL! (of Unix processes)

The start of this isn’t my post – I got it from here: but I wanted to reblog/repost and enhance it because, as far as I can tell, 99% of all known DBAs only use kill -9 to remove unhappy processes.


Original Post:

Useless Use of Kill -9 form letter

No no no.  Don't use kill -9.

It doesn't give the process a chance to cleanly:

1) shut down socket connections

2) clean up temp files

3) inform its children that it is going away

4) reset its terminal characteristics

and so on and so on and so on.

Generally, send 15, and wait a second or two, and if that doesn't
work, send 2, and if that doesn't work, send 1.  If that doesn't,
REMOVE THE BINARY because the program is badly behaved!**

Don't use kill -9.  Don't bring out the combine harvester just to tidy
up the flower pot.

**don’t remove your Oracle or any other binaries please.


 

I hope you found that useful. I know I did. But what do the numbers mean? Well, they are increasingly violent ways to ask the program to stop itself. The command kill -9 isn’t asking the program to stop, it’s asking the O/S to stop running the program now, regardless of what it’s doing.

Run order of kills:

kill -15 : this is the equivalent of kill -sigterm and is the default if you don’t specify a signal. The program should terminate after it has finished what it is doing.

kill -2 : this is the equivalent of kill -sigint and is the same as pressing CTRL+C. This should mean “stop what you’re doing” — and it may or may not kill the program.

kill -1 : this is the equivalent of kill -sighup and tells the program that the user has disconnected. (e.g. SSH session or terminal window was closed). It usually results in a graceful shutdown of the program.
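The run order above can be sketched as a small shell function. This is only a sketch under assumptions: polite_kill and its two-second default wait are made-up names and values, not a standard tool, and it assumes you own the target process (kill -0 also fails for processes you aren't allowed to signal).

```shell
#!/bin/bash
# polite_kill PID [SECONDS]: hypothetical helper, not a standard command.
# Sends TERM, then INT, then HUP, pausing after each one.
# Returns 0 once the process has gone, 1 if it survived all three.
polite_kill() {
    local pid=$1 wait_secs=${2:-2} sig
    for sig in TERM INT HUP; do
        kill -0 "$pid" 2>/dev/null || return 0   # signal 0 only tests for existence
        kill -s "$sig" "$pid" 2>/dev/null
        sleep "$wait_secs"
    done
    ! kill -0 "$pid" 2>/dev/null                 # final check after the last signal
}
```

Something like polite_kill 12345 would then work through the three signals in turn, and a non-zero return tells you the process ignored all of them.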

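The executing program needs to be coded to catch these signals if it is going to clean up after itself, and all good software will do so. If a program doesn’t catch them, the default action for all three is simply to terminate it.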
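To show what "coded to recognise" a signal can look like, here is a minimal sketch of a shell job that tidies up its scratch file when it is asked to stop. The worker function and the scratch file are hypothetical stand-ins for real work; run it in the background so the traps apply to its own subshell.

```shell
#!/bin/bash
# worker FILE: hypothetical long-running job that owns a scratch file.
# Intended to be started in the background, e.g.  worker /tmp/job.scratch &
worker() {
    scratch=$1                          # global on purpose: the EXIT trap reads it later
    : > "$scratch"                      # create the scratch file
    trap 'rm -f "$scratch"' EXIT        # tidy up however we leave
    trap 'exit 0' TERM INT HUP          # polite signals become a clean exit
    while :; do sleep 1; done           # stand-in for real work
}
```

A kill -15 against the worker removes the scratch file on the way out. A kill -9 would leave it behind, which is the whole point of the form letter above.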

The other fun kill command is kill -sigstop. This can’t be blocked (like -9) as it’s an O/S level command too, but freezes the program execution like pressing CTRL+Z. You can continue the program execution later using kill -sigcont.
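A quick way to see the freeze and thaw for yourself is sketched below. It assumes Linux (it reads the process state from /proc), uses a sleep as a stand-in for the real process, and proc_state is a made-up helper name.

```shell
#!/bin/bash
# proc_state PID: print the one-letter process state (R, S, T, ...) from /proc.
proc_state() {
    awk '/^State:/ {print $2}' "/proc/$1/status"
}

sleep 30 &                            # stand-in for the process you want to pause
pid=$!

kill -s STOP "$pid"                   # freeze it; this cannot be caught or ignored
sleep 1
stopped_state=$(proc_state "$pid")    # T means stopped

kill -s CONT "$pid"                   # let it carry on where it left off
sleep 1
resumed_state=$(proc_state "$pid")    # no longer T once it is running again

echo "stopped=$stopped_state resumed=$resumed_state"
kill "$pid"                           # tidy up the demo process
```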


Linux Annoying Defaults, and changes when moving to RH/OEL7

So why does Linux have an alias for “ls” which turns on colour, by default, making some text impossible to read? eh?

alias ls='ls --color=auto'

To stop this temporarily, you can “unalias ls”, but to stop it permanently for everyone:

vi /etc/profile.d/colorls.sh

comment out the line:

alias ll='ls -l --color=auto' 2>/dev/null
alias l.='ls -d .* --color=auto' 2>/dev/null
# alias ls='ls --color=auto' 2>/dev/null

And that’s it. Cured for life.
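If you can't (or would rather not) edit a system file, a per-user version of the same cure might look like the fragment below. It is a sketch that assumes the stock layout, where ~/.bashrc sources /etc/bashrc and that in turn reads /etc/profile.d/*.sh, so the unalias has to come after that line.

```shell
# ~/.bashrc fragment (per-user; assumes /etc/bashrc is sourced further up this file)
unalias ls 2>/dev/null    # drop the colour alias if it was defined

# One-off alternatives at the prompt, with the alias left alone:
#   \ls            a leading backslash skips alias expansion
#   command ls     so does the "command" builtin
```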

While I’m on, why oh why has so much pointlessly changed between RH/OEL6 and RH/OEL7?

Switching off the firewall is now

systemctl stop firewalld
systemctl disable firewalld
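It's worth confirming that the change stuck. The sketch below wraps the two commands with the matching systemctl status queries; firewall_off is a made-up name, and it needs to be run as root on a systemd host.

```shell
#!/bin/bash
# firewall_off: hypothetical wrapper around the two commands above; run as root.
firewall_off() {
    systemctl stop firewalld       # stop the running daemon now
    systemctl disable firewalld    # and keep it from starting at boot
    # Confirm: expect "inactive" and "disabled". Both queries exit non-zero
    # in those states, so that is not treated as a failure here.
    systemctl is-active firewalld
    systemctl is-enabled firewalld
    return 0
}
```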

And changing the hostname now has a dedicated command all to itself, instead of just amending /etc/sysconfig/network (which you can still do)

hostnamectl set-hostname new-host-name-here

And what does it do? It creates a /etc/hostname file (and sets the hostname so you don’t need to reboot, so you should use this method)
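A minimal sketch of setting the name and then checking the result, assuming root on a systemd host. rename_host is a hypothetical wrapper, not part of hostnamectl.

```shell
#!/bin/bash
# rename_host NAME: hypothetical wrapper; run as root on a systemd host.
rename_host() {
    local new_name=$1
    [ -n "$new_name" ] || { echo "usage: rename_host NAME" >&2; return 2; }
    hostnamectl set-hostname "$new_name"   # takes effect immediately, no reboot
    hostnamectl status                     # shows the static hostname now in force
}
```

Afterwards, /etc/hostname should contain the new name as well.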

And another thing. Why has the ifconfig command vanished ?

ifconfig
-bash: ifconfig: command not found

pifconfig
lo
          inet addr:127.0.0.1   Mask:255.0.0.0
          inet6 addr: ::1/128 Scope: host
          UP LOOPBACK RUNNING

enp0s3    HWaddr 08:00:27:75:c8:1e
          inet addr:192.168.56.200 Bcast:192.168.56.255   Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe75:c81e/64 Scope: link
          UP BROADCAST RUNNING MULTICAST

enp0s8    HWaddr 08:00:27:58:20:01
          inet addr:10.10.0.1 Bcast:10.255.255.255   Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe58:2001/64 Scope: link
          UP BROADCAST RUNNING MULTICAST

or more correctly:

ip addr
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 12:00:00:00:00:11 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.201/16 brd 192.168.255.255 scope global enp0s3
    inet 192.168.56.211/24 brd 192.168.56.255 scope global enp0s3:1
    inet6 fe80::1000:ff:fe00:11/64 scope link
       valid_lft forever preferred_lft forever
3: enp0s8:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 12:00:00:00:00:21 brd ff:ff:ff:ff:ff:ff
    inet 10.10.0.1/24 brd 10.10.0.255 scope global enp0s8
    inet 169.254.249.91/16 brd 169.254.255.255 scope global enp0s8:1
    inet6 fe80::1000:ff:fe00:21/64 scope link
       valid_lft forever preferred_lft forever

The whole of (the unmaintained) net-tools has been deprecated. No more:

[root@rac12c01 ~]# netstat
-bash: netstat: command not found

We now need to learn to use the iproute2 suite of commands instead:

ifconfig -> ip addr (or ip link - e.g. ip link set arp on)
route    -> ip route
arp      -> ip neighbor (e.g. ip n show)
vconfig  -> ip link
iptunnel -> ip tunnel (add/change/del/show)
brctl    -> bridge
ipmaddr  -> ip maddr
netstat  -> ss (or a bunch of ip commands)
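For the invocations I type most often, the nearest equivalents can be kept in a tiny lookup like the sketch below. new_for is a made-up helper, the choice of "common" invocations is mine, and the output format of the new commands differs from the old ones even where the information is the same.

```shell
#!/bin/bash
# new_for "OLD COMMAND": hypothetical cheat-sheet lookup for a few common invocations.
new_for() {
    case $1 in
        "ifconfig -a")   echo "ip addr show" ;;    # all interfaces and addresses
        "route -n")      echo "ip route show" ;;   # routing table
        "netstat -rn")   echo "ip route show" ;;   # routing table again
        "arp -an")       echo "ip neigh show" ;;   # ARP / neighbour cache
        "netstat -tln")  echo "ss -tln" ;;         # listening TCP sockets, numeric
        *) echo "no suggestion for: $1" >&2; return 1 ;;
    esac
}
```

For example, new_for "netstat -tln" prints ss -tln.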

(or you could just yum install net-tools to get the old tools back, but that’s just not the right thing to do, is it?)

You might want to yum install lsof and yum install nmap though. They aren’t there by default in OEL7

And another thing – /tmp living in memory (tmpfs) by default. Why? It’s too small to be any good for anything really. To switch it back to being a real filesystem (and get your memory back):

systemctl mask tmp.mount
(outputs) ln -s '/dev/null' '/etc/systemd/system/tmp.mount'
... and reboot
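Before masking anything (and again after the reboot), it's worth checking what /tmp actually sits on for your install. A minimal sketch, assuming GNU stat as shipped with RH/OEL; tmp_fstype is a made-up helper name.

```shell
#!/bin/bash
# tmp_fstype: print the filesystem type behind /tmp (uses GNU stat's -f -c %T).
tmp_fstype() {
    stat -f -c %T /tmp
}

if [ "$(tmp_fstype)" = "tmpfs" ]; then
    echo "/tmp is in memory (tmpfs)"
else
    echo "/tmp is on $(tmp_fstype)"
fi
```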

And another thing – why is grep (and egrep and fgrep) now aliased to colourise your search results? To be honest, I happen to agree with this. Nice new default feature:

alias
alias egrep='egrep --color=auto'
alias fgrep='fgrep --color=auto'
alias grep='grep --color=auto'

cat /etc/passwd  | grep oracle
oracle:x:500:500::/home/oracle:/bin/bash
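Part of why this default is harmless: with --color=auto, grep only colours its output when that output is a terminal, so pipes and scripts still see plain text. The sketch below shows the difference against --color=always. It assumes GNU grep, and the sample line is just the one from above.

```shell
#!/bin/bash
line='oracle:x:500:500::/home/oracle:/bin/bash'

# Into a pipe or a variable, --color=auto adds nothing...
auto_out=$(printf '%s\n' "$line" | grep --color=auto oracle)

# ...whereas --color=always embeds escape sequences wherever the output goes.
always_out=$(printf '%s\n' "$line" | grep --color=always oracle)

[ "$auto_out" = "$line" ] && echo "auto: plain text when not on a terminal"
[ "$always_out" != "$line" ] && echo "always: escape codes included"
```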

More mini-rants will appear in this blog post as I fall across/remember the issues

RACCheck

Running RAC? (Why? No, really, WHY?  Never heard of DataGuard? With a broker?)

Running RAC?
Not sure if you’ve configured it correctly?
Not sure if you have all of the recommended initialisation parameters set?
All recommended RPMs installed?
All daemons running?
etc, etc, etc,

Well, as of Oracle 11.2.0.4 there’s a new feature provided by default called RACCheck. You can find it installed in the directory $ORACLE_HOME/suptools/raccheck (or you can download it from MOS article 1268927.1), and the executable is called “raccheck”. With a little sudo configuration, or the root passwords, you can check the configuration on every node in a few minutes per node (run at a sensible time). All the basics appear to be covered, and you get a nice list of anomalies out of the system in HTML format.

I don’t necessarily agree with some of the errors/warnings produced (you might want the “problems” it’s finding!), but it gives you cause to re-think about an element of the system that may be configured in a non-standard way, and you get lots of relevant and useful links to MOS articles.

e.g. One problem: 

WARNING SQL Check Some user sessions lack proper failover mode (BASIC) and method (SELECT) All Databases

Can be happily ignored as I’m using a SCAN listener, which renders this WARNING irrelevant.

But I would recommend that you use the utility and accept/understand any exceptions. It should help stabilise any RAC installations you may have.