Oracle RAC 10g x86 on SUSE LINUX Enterprise Server 9 x86
Oracle has recently released a new 10g RDBMS version with the first
service pack integrated: 10.1.0.3. It gives better support for SuSE and
solves some installation issues.
In addition, SuSE has released a new orarun version which makes the
installation process a little bit easier.
The plain RDBMS installation is the smoothest I have performed so far.
Unfortunately, RAC is a little bit more complex and requires some
additional steps and some architectural decisions to be made beforehand.
First of all you need at least two machines (physical or virtual) and
shared storage to which they both have equal access.
If you only wish to set up a test environment I suggest VMware
ESX (or GSX if you have only a client).
Otherwise shared storage accessible over a storage area network (SAN)
would be the best solution for a production environment.
NFS is also possible, but in general I would discourage it: too
unreliable (OK, you could use it in a cheap test setup, but do not try
it on a production system!).
If you do need a reliable production setup on NFS, prefer Oracle RAC
with Network Appliance NFS servers. There is even a Filer
Simulator available for download from Network Appliance.
I will cover the installation of the 10.1.0.3 version which can be
downloaded from OTN.
I will describe an installation using shared storage on a SAN. The
storage will be managed by Oracle Automatic Storage Management (ASM).
You have other choices: Oracle Cluster File System (OCFS), which is
supported in the default SLES9 kernel, raw devices, or a third-party
(and perhaps unsupported) shared storage solution.
You need two components: ship.db.lnx32.cpio.gz (the DB installation)
and ship.crs.lnx32.cpio.gz (the cluster services).
Install the cluster services component first, then the database.
ASM was chosen for simplicity and for testing purposes (I lack
experience with ASM). ASM can be a good replacement for Linux's
logical volume management (LVM) or device mapper (DM). However, when
using ASM the basic I/O layer still depends on direct access to raw
disk partitions (you'll see later on why it is important to understand
this).
After installing the basic system make sure you have the X libraries,
libaio, libaio-devel, compat and openmotif packages installed
(including the 32-bit versions):
make-3.80-184.1
gcc-3.3.3-43.24
compat-2004.7.1-1.2
XFree86-libs-4.3.99.902-43.22
libaio-devel-0.3.98-18.4
libaio-0.3.98-18.4
openmotif-libs-2.2.2-519.1
openmotif-2.2.2-519.1
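The package check can be scripted. The following is only a minimal sketch: the function name missing_packages is my own invention, and it assumes the installed package names are fed in one per line (for example from rpm -qa with a name-only query format).

```shell
#!/bin/sh
# missing_packages (hypothetical helper): read installed package names,
# one per line, on stdin and print every required name (given as an
# argument) that is not among them.
missing_packages() {
    installed=$(cat)
    for pkg in "$@"; do
        # -x matches whole lines only, so libaio does not match libaio-devel
        printf '%s\n' "$installed" | grep -qxF -- "$pkg" || printf '%s\n' "$pkg"
    done
}

# Typical call on an rpm-based system, with the names from the list above:
#   rpm -qa --qf '%{NAME}\n' | missing_packages make gcc compat XFree86-libs \
#       libaio libaio-devel openmotif-libs openmotif
```

An empty output means nothing is missing. The sketch compares names only, not the version numbers listed above.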
Installing the orarun package will make your installation easier so I
recommend installing it. However, read the notes below before
installing it.
The latest version is orarun-1.8-109.5, which can be downloaded from
the SUSE website (actually their FTP server).
Note on gcc:
gcc_old-2.95.3-11 is not actually necessary (despite what some
websites say). On the contrary, the linking phase needs gcc 3.x!
So be warned: if you are going to install an older gcc for any purpose,
make sure that Oracle finds the 3.x version during the relink.
Note on orarun:
Orarun is a useful package which can simplify the preinstallation part.
The new orarun checks for gcc_old but does not depend on it anymore.
The operations from here on are to be performed on every node:
linux: # rpm -Uvh orarun-1.8-109.5.i686.rpm
The orarun package addresses the "infamous" Oracle installer issue,
which manifests itself with the following error message when invoking
the Oracle Universal Installer using runInstaller:
Unable to load native library:
/tmp/OraInstall2004-02-24_10-40-59AM/jre/lib/i386/libjava.so: symbol
__libc_wait, version GLIBC_2.0 not defined in file libc.so.6 with link
time reference.
You no longer need to install (or create yourself) patch
#3006854 for __libc_wait.
Instead, simply modify /etc/profile.d/oracle.sh by adding:
export LD_PRELOAD=/usr/lib/libInternalSymbols.so
export LD_ASSUME_KERNEL=2.4.21
Setting LD_PRELOAD in this way works around the above issue.
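If you prepare several nodes, the edit can be made repeatable. This is a sketch under assumptions: ensure_line is a hypothetical helper of mine, not something shipped with orarun, and I assume both variables should be exported.

```shell
#!/bin/sh
# ensure_line FILE LINE (hypothetical helper): append LINE to FILE unless
# an identical line is already present, so repeated runs change nothing.
ensure_line() {
    file=$1
    line=$2
    touch "$file" || return 1
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (as root), using the two settings described above:
#   ensure_line /etc/profile.d/oracle.sh 'export LD_PRELOAD=/usr/lib/libInternalSymbols.so'
#   ensure_line /etc/profile.d/oracle.sh 'export LD_ASSUME_KERNEL=2.4.21'
```

The helper only appends. If the file already sets one of the variables to a different value, you still have to resolve that by hand.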
Create the directory tree for the Oracle installation (look at the
standard OFA): the default is /opt/oracle/product/10g/db_1.
I prefer /u01/app/oracle/product/10g/db_1
linux: # mkdir -p /u01/app/oracle/product/10g/db_1
linux: # mkdir /u01/app/oracle/product/10g/crs
Make sure to change the ownership of the tree with chown (the owner
should be the oracle user and the group should be the oinstall group).
linux: # chown -R oracle:oinstall /u01/app/oracle
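Since these commands have to be repeated on every node, they can be wrapped in a small function. This is just a sketch: make_oracle_tree is a name I made up, and it assumes the oracle user and oinstall group already exist (orarun creates the user).

```shell
#!/bin/sh
# make_oracle_tree BASE (hypothetical helper): create the db_1 and crs
# homes under BASE/product/10g. Ownership is changed only when running as
# root and only if the oracle user exists.
make_oracle_tree() {
    base=$1
    mkdir -p "$base/product/10g/db_1" "$base/product/10g/crs" || return 1
    if [ "$(id -u)" -eq 0 ] && id oracle >/dev/null 2>&1; then
        chown -R oracle:oinstall "$base"
    fi
}

# Example (as root): make_oracle_tree /u01/app/oracle
```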
Now you can modify some files in /etc:
- /etc/passwd: change the shell for the oracle user created by
orarun (default is /bin/false);
- /etc/group: oracle user should belong to dba and oinstall;
- /etc/sysconfig/oracle: ORACLE_BASE, ORACLE_HOME, ORACLE_SID
and several kernel parameters, plus the start parameters for the
oracle script in /etc/init.d (useful during machine boot);
- /etc/profile.d/oracle.sh (or oracle.csh depending on the shell
you chose above). Make sure you set LD_ASSUME_KERNEL='2.4.21' as
described above (other values could be used: read the paper by Ulrich
Drepper at http://people.redhat.com/drepper/assumekernel.html).
Note: I attended a Red Hat workshop about "RAC installation on Red Hat
AS 3". The experience I gained there made the installation on SLES9
easier.
In that workshop they advised me not to set the other environment
variables and to keep only the ORACLE_BASE environment setting.
It seems that on Red Hat you can't complete the installation properly
otherwise... I was able to set the variables on SuSE without problems
(it helps to solve relinking issues).
You are free to follow your own judgment for the best installation.
The operating systems of each node need to be configured in preparation
for a RAC installation.
Modify /etc/hosts (I suggest you do this even if you have a DNS),
inserting the definitions for all the nodes. Here is an example:
---------------------------------------------------------------------------------------------
127.0.0.1 localhost
# special IPv6 addresses
::1 localhost ipv6-localhost ipv6-loopback
fe00::0 ipv6-localnet
ff00::0 ipv6-mcastprefix
ff02::1 ipv6-allnodes
ff02::2 ipv6-allrouters
ff02::3 ipv6-allhosts
192.168.24.61 sles9rac2.ras sles9rac2
192.168.24.60 sles9rac1.ras sles9rac1
192.168.24.63 sles9rac2-vip.ras sles9rac2-vip
192.168.24.62 sles9rac1-vip.ras sles9rac1-vip
192.168.255.2 rac2-int.ras rac2-int
192.168.255.1 rac1-int.ras rac1-int
------------------------------------------------------------------------------------------------
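Before going on it may be worth checking that every name the installer will use really appears in the file. A minimal sketch follows; missing_hosts is a hypothetical helper of mine, and it only inspects the file (it does not query DNS).

```shell
#!/bin/sh
# missing_hosts FILE NAME... (hypothetical helper): print each NAME that has
# no entry in FILE. Comments are ignored; a name matches any field after
# the address.
missing_hosts() {
    file=$1
    shift
    for name in "$@"; do
        awk -v n="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$file" || printf '%s\n' "$name"
    done
}

# Example with the names from the sample above:
#   missing_hosts /etc/hosts sles9rac1 sles9rac2 sles9rac1-vip sles9rac2-vip \
#       rac1-int rac2-int
```

Running the same call on every node is a quick way to confirm that the files are identical where it matters.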
You need two NICs per node: one for public connections and the other
for the interconnect.
The virtual IP has to be defined, but it is not associated with any
physical adapter yet. That configuration will be performed later by
Oracle.
The configuration on all the nodes should be identical. I suggest you
transfer the hosts file using scp instead of simply cutting and
pasting the entries. Then change the permission on the file:
linux: # chmod u-w /etc/hosts
otherwise you could have problems when changing the network
configuration using yast.
Restart your network services with:
linux: # /etc/init.d/network restart
The above part is important: without this, you risk having the
installation stop while attempting to perform its tasks on each remote
node.
Now you need to set up ssh properly for the oracle user.
As the oracle user, go to its home directory (on SuSE, by default, it
is /opt/oracle) and create the .ssh directory:
oracle@sles9rac2:~> mkdir .ssh
Then you need to generate a pair of private and public keys for ssh.
This is the first step in generating the ssh configuration which is
going to allow the installation to be performed on every node at once.
Below is an example taken from a system of mine. I gave no passphrase.
oracle@sles9rac2:~> ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/opt/oracle/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /opt/oracle/.ssh/id_rsa.
Your public key has been saved in /opt/oracle/.ssh/id_rsa.pub.
The key fingerprint is:
38:d3:7f:57:38:63:f4:94:9b:e3:38:7b:f7:77:13:ac oracle@sles9rac2
You can also choose to use a different key type (example: -t dsa).
Now you have two files: id_rsa and id_rsa.pub.
The first is the private key (to guard closely) while the second is the
public key which should be shared by all nodes.
After you have generated the key pair on every node, you have to
collect all the public keys in a file called authorized_keys2
(you can copy them to a single node and then append them with 'cat'
into a single authorized_keys2 file).
Example:
oracle@sles9rac2:~> cat id_rsa.pub >> authorized_keys2
You need to end up with a file containing all the keys of all the nodes.
Copy over that file to each node, placing it in the $HOME/.ssh
directory.
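The accumulation step can be sketched as a small function. merge_pubkeys is a name I invented, and the chmod 600 at the end is my own precaution (sshd is usually strict about the permissions of key files), not something taken from the Oracle documentation.

```shell
#!/bin/sh
# merge_pubkeys OUT FILE... (hypothetical helper): write the union of all
# key lines found in FILE... to OUT, keeping the first occurrence of each
# key and dropping blank lines.
merge_pubkeys() {
    out=$1
    shift
    awk 'NF && !seen[$0]++' "$@" > "$out.tmp" && mv "$out.tmp" "$out" || return 1
    chmod 600 "$out"
}

# Example, after copying each node's id_rsa.pub to one node under a
# distinct name (the file names here are placeholders):
#   merge_pubkeys ~/.ssh/authorized_keys2 rac1_id_rsa.pub rac2_id_rsa.pub
```

Because duplicates are dropped, you can safely run it again when a node is added later.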
Another solution is to generate only one pair of keys on one node and
insert the public key into authorized_keys2 as described above. Then
you can copy the three files (id_rsa.pub, id_rsa, authorized_keys2)
over to every $HOME/.ssh directory on each node.
Now, from every node, connect to every other node using all the public
and private names listed in /etc/hosts (with and without the domain).
Reply 'yes' to every question and make sure that you are no longer
prompted for a password.
From the second connection on you shouldn't receive any message or
request. You need to be immediately authenticated and presented with a
shell prompt for the Oracle installation to proceed smoothly.
Warning!!!!!
Even if you are authenticated without a password or any other request,
any output (or warning) shown at login will be interpreted by Oracle
as an error, stopping the installation. So, solve any related
issue/warning before going ahead.
Here, you can see an example of the messages shown when establishing
the initial ssh connections:
oracle@sles9rac1:~/.ssh> ssh oracle@192.168.255.1
The authenticity of host '192.168.255.1 (192.168.255.1)' can't be
established.
RSA key fingerprint is 4c:70:d1:4c:6c:71:5c:19:a6:87:14:38:e5:f7:7f:51.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.255.1' (RSA) to the list of known
hosts.
Last login: Wed Nov 10 16:18:43 2004 from 192.168.255.2
oracle@sles9rac1:~> exit
logout
Connection to 192.168.255.1 closed.
oracle@sles9rac1:~/.ssh> ssh oracle@192.168.255.2
The authenticity of host '192.168.255.2 (192.168.255.2)' can't be
established.
RSA key fingerprint is 4c:70:d1:4c:6c:71:5c:19:a6:87:14:38:e5:f7:7f:51.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.255.2' (RSA) to the list of known
hosts.
Last login: Wed Nov 10 16:20:14 2004 from 192.168.24.60
oracle@sles9rac2:~> exit
logout
Connection to 192.168.255.2 closed.
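Once the host keys have been accepted, the "no password, no output" requirement can be verified with a short loop. The following is only a sketch: ssh_is_silent is a hypothetical helper, and the SSH variable exists only so that the command can be replaced by a stub. It relies on the standard OpenSSH option BatchMode, which makes ssh fail instead of prompting.

```shell
#!/bin/sh
# SSH may be overridden (for instance with a stub) when testing.
SSH=${SSH:-ssh}

# ssh_is_silent HOST (hypothetical helper): succeed only if a
# non-interactive login to HOST works and prints nothing at all on
# stdout or stderr.
ssh_is_silent() {
    output=$($SSH -o BatchMode=yes "$1" true 2>&1) || return 1
    [ -z "$output" ]
}

# Example, using the names from the /etc/hosts sample:
#   for h in sles9rac1 sles9rac1.ras rac1-int sles9rac2 sles9rac2.ras rac2-int; do
#       ssh_is_silent "$h" || echo "fix ssh to $h before installing"
#   done
```

Run the loop as the oracle user on every node. A host that still asks for confirmation of its key will be reported as well, since BatchMode refuses to prompt.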
Last step before starting:
You need to configure the shared storage. I used ASM so I needed
at least three raw devices: one for the quorum disk (at least 200MB),
one for the voting disk (also 200MB) and one or more disks to be
managed by ASM.
In /etc/raw insert each raw device name and the block device it is
bound to. Example:
raw1:sdb1
raw2:sdb2
raw3:sdb3
Now start the raw service (as root):
linux: # /etc/init.d/raw start
From the manual: you should change ownership and permissions:
linux: # chown oracle:dba /dev/raw/raw1
linux: # chown oracle:dba /dev/raw/raw2
linux: # chown oracle:dba /dev/raw/raw3
linux: # chmod 660 /dev/raw/raw1
linux: # chmod 660 /dev/raw/raw2
linux: # chmod 660 /dev/raw/raw3
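With more than a few devices it is easier to derive the device list from /etc/raw than to type it. This is a sketch under assumptions: raw_devices is a hypothetical helper, and it assumes the file holds "name:blockdevice" entries as in the example above, possibly with # comment lines.

```shell
#!/bin/sh
# raw_devices FILE (hypothetical helper): print /dev/raw/<name> for every
# "name:blockdev" entry in FILE, skipping blank lines and comments.
raw_devices() {
    awk -F: '!/^[ \t]*(#|$)/ && NF >= 2 { print "/dev/raw/" $1 }' "$1"
}

# Example (as root):
#   for dev in $(raw_devices /etc/raw); do
#       chown oracle:dba "$dev" && chmod 660 "$dev"
#   done
```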
On SuSE the oracle user (the one created by orarun) is a member of the
disk group, which has the rights to read and write raw devices.
Just to be sure: check that your oracle user is a member of disk (if
not, add it by editing /etc/group or using yast), and try to read
(and, on a not yet used device, write) a couple of raw devices.
oracle@sles9rac2:~> id
uid=100(oracle) gid=102(dba)
groups=6(disk),101(oinstall),102(dba),103(oper)
oracle@sles9rac2:~> dd if=/dev/raw/raw1 of=/tmp/foo bs=4096 count=8
8+0 records in
8+0 records out
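A quicker, non-destructive check of the permissions on all devices at once could look like the sketch below. check_rw is a hypothetical helper; it only tests the access bits as seen by the current user and does not actually read or write any data, so it complements the dd test rather than replacing it.

```shell
#!/bin/sh
# check_rw PATH... (hypothetical helper): report every path the current
# user cannot both read and write; return non-zero if any was reported.
check_rw() {
    status=0
    for path in "$@"; do
        if [ ! -r "$path" ] || [ ! -w "$path" ]; then
            printf 'no read/write access: %s\n' "$path"
            status=1
        fi
    done
    return $status
}

# Example (as the oracle user):
#   check_rw /dev/raw/raw1 /dev/raw/raw2 /dev/raw/raw3
```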
Now the system has been preconfigured. You only need to unpack the
downloaded Oracle archives and install them:
oracle@sles9rac2:~> gunzip ship.crs.lnx32.cpio.gz
oracle@sles9rac2:~> cpio -idmv < ship.crs.lnx32.cpio
You are ready to install the Oracle cluster services:
- if you are on a remote machine make sure your X server is running
and export the DISPLAY: export DISPLAY=<your local IP>:0.0;
- launch runInstaller from the Disk1 directory with the command
"./runInstaller"
I'm adding some images which can increase the clarity of the next steps:
Change the destination to the crs directory created earlier.

Select oinstall as the group for performing installations and launch
the required script as root.
This is a critical step. You need to insert the public and private
node names and the cluster name.
The names have to be identical to the ones listed in /etc/hosts.




Then simply launch the script as root on every node.

If everything went fine you are ready for the database install:
oracle@sles9rac2:~> gunzip ship.db.lnx32.cpio.gz
oracle@sles9rac2:~> cpio -idmv < ship.db.lnx32.cpio
Launch the unpacked runInstaller from the Disk1 directory and perform
the usual steps.
The destination home needs to be the ORACLE_HOME.
A check of the installed packages is performed.


Later you have to launch another script as root on all nodes. Before
doing so you need to export the DISPLAY if you are installing the
components remotely.

The script will open a new window. Deselect the private interface and
carry on:

Insert the VIP (virtual IP) with the same definition as listed in
/etc/hosts.

You can skip the configuration assistant for the listener.
This concludes the installation of the database software.
Link the existing oratab to the one needed by oracle (from root):
linux: # ln -s /etc/oratab /var/opt/oracle/oratab
Now you only need to create your database.
Notes on cssd and ASM
If you are using ASM the default configuration is wrong and after a
reboot you could get an ORA-29702 or ORA-29701 error.
In /etc/oratab set to Y the entries you wish to be started
automatically:
*:/u01/app/oracle/product/10.1/db_1:N
+ASM:/u01/app/oracle/product/10.1/db_1:Y
PITIA:/u01/app/oracle/product/10.1/db_1:Y
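To double check which entries will be started at boot, the third field of each oratab line can be inspected. A small sketch follows; autostart_sids is a name I made up.

```shell
#!/bin/sh
# autostart_sids FILE (hypothetical helper): print the SID (first field) of
# every oratab entry whose third field is Y, skipping comments and blanks.
autostart_sids() {
    awk -F: '!/^[ \t]*(#|$)/ && $3 == "Y" { print $1 }' "$1"
}

# Example: autostart_sids /etc/oratab
```

With the three lines shown above it should list +ASM and PITIA.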
Then in /etc/inittab move the cssd line before the runlevel 3 line (l3):
l0:0:wait:/etc/init.d/rc 0
l1:1:wait:/etc/init.d/rc 1
l2:2:wait:/etc/init.d/rc 2
# Starting Cluster Daemon for ASM
h1:35:respawn:/etc/init.d/init.cssd run >/dev/null 2>&1 </dev/null
l3:3:wait:/etc/init.d/rc 3
#l4:4:wait:/etc/init.d/rc 4
l5:5:wait:/etc/init.d/rc 5
l6:6:wait:/etc/init.d/rc 6
(make sure you don't have two init.cssd lines).
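Both conditions (a single init.cssd line, placed before the l3 entry) can be checked mechanically before rebooting. This is a minimal sketch: cssd_before_rc3 is a hypothetical helper, and it assumes the runlevel 3 entry starts with "l3:3:wait:" as in the excerpt above.

```shell
#!/bin/sh
# cssd_before_rc3 FILE (hypothetical helper): succeed if FILE holds exactly
# one uncommented init.cssd entry and it appears before the "l3:3:wait:"
# entry.
cssd_before_rc3() {
    awk '
        /^[ \t]*#/ { next }
        /init\.cssd/ { count++; if (!cssd) cssd = NR }
        /^l3:3:wait:/ { if (!rc3) rc3 = NR }
        END { exit ((count == 1 && rc3 && cssd < rc3) ? 0 : 1) }
    ' "$1"
}

# Example:
#   cssd_before_rc3 /etc/inittab || echo "check the init.cssd line in /etc/inittab"
```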
Now you can test the reboot (in /etc/sysconfig/oracle you need to
decide which components to start on reboot).
Have fun!
Contact information:
fabrizio.magni _at_ europe.com
fabrizio.magni _at_ rasnet.it