Building a highly available file server with NFS v4, DRBD 8.3, and Heartbeat on CentOS 6.
For this tutorial we will use a minimal installation of CentOS 6 / RHEL 6 on both servers. I like the minimal installation because it gives more control over what is installed and keeps the system as clean and light as possible. Red Hat / CentOS 6 ships with NFSv4, kernel 2.6.32 and the ext4 file system by default, which should help the performance of your file server.
Preparing the servers
Once the OS is installed, we will disable SELinux and iptables and bring CentOS / Red Hat up to date. Both can be re-enabled later; for now they are disabled to make troubleshooting easier.
- Disabling selinux
# vi /etc/selinux/config
SELINUX=disabled
- Turn off iptables at system startup
# chkconfig iptables off
- Update the system
# yum -y update
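If you would rather not open an editor on each server, the SELinux change above can be scripted. This is only a minimal sketch, assuming GNU sed (as shipped with CentOS 6); the function name set_selinux_mode is made up for this example.

```shell
# Set the SELINUX= line in an SELinux config file without opening an editor.
# Usage: set_selinux_mode <mode> [config-file]
# (hypothetical helper; the default path is the one edited above)
set_selinux_mode() {
    mode=$1
    file=${2:-/etc/selinux/config}
    # "^SELINUX=" matches only the mode line, not SELINUXTYPE=.
    sed -i "s/^SELINUX=.*/SELINUX=$mode/" "$file"
}
```

For example, run `set_selinux_mode disabled` as root on each server. As with the manual edit, the change only takes effect after a reboot.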
Now we will install some packages that will be necessary for DRBD installation and configuration.
# yum install wget ntp
We need to make sure that ntpd is always running to keep both servers time synchronized.
# chkconfig ntpd on
I also always like to have gcc and make available in case we need to compile any code that is not available in rpm packages.
# yum install gcc make
My preferred text editor is Vim, which I use throughout this tutorial (as vi); you can use any other editor you like.
# yum install vim-enhanced
# echo "alias vi='vim'" >> ~/.bashrc
I will assume that the network and hostname were set up during the OS installation. If you haven’t done that yet, now is a good time.
Here are the files you will need to modify:
- Hostname / Gateway
/etc/sysconfig/network
- ETH0 interface
/etc/sysconfig/network-scripts/ifcfg-eth0
- DNS Servers
/etc/resolv.conf
Okay, for some of these changes to take effect we will reboot the servers. This is only necessary because we disabled SELinux and updated the kernel; all the other services could have been restarted without a reboot.
# reboot
DRBD 8.3 and RHEL / CentOS 6
As you probably know, RHEL 6 / CentOS 6 does not ship DRBD in any of its standard yum repositories. You need a support contract with Red Hat to get DRBD (Red Hat partnered with LINBIT after deciding not to support DRBD in EL6, because DRBD did not get into the mainline kernel until 2.6.33 and EL6 has 2.6.32). Alternatively, you can install it from source, from RPM packages, or from the ELRepo yum repository (http://elrepo.org).
- Option 1 – RPM Packages
# wget http://dl.atrpms.net/el6-`arch`/atrpms/stable/drbd-kmdl-`uname -r`-8.3.8.1-30.el6.`arch`.rpm
# wget http://dl.atrpms.net/el6-`arch`/atrpms/stable/drbd-8.3.8.1-30.el6.`arch`.rpm
# rpm -ivh drbd-kmdl-`uname -r`-8.3.8.1-30.el6.`arch`.rpm drbd-8.3.8.1-30.el6.`arch`.rpm
- Option 2 – ELRepo
# rpm -Uvh http://elrepo.org/elrepo-release-6-4.el6.elrepo.noarch.rpm
# vi /etc/yum.repos.d/elrepo.repo
enabled=0
# yum --enablerepo=elrepo install drbd83-utils kmod-drbd83
Let's now set up our DRBD device. We will create a 40 GB LVM logical volume to be used by DRBD.
First of all, verify that on both servers ntpd is running and that the time is equal.
[root@c6server1 ~]# service ntpd status
ntpd (pid 1072) is running...
[root@c6server1 ~]# date
Mon Aug 15 10:48:14 PDT 2011
[root@c6server2 ~]# service ntpd status
ntpd (pid 1088) is running...
[root@c6server2 ~]# date
Mon Aug 15 10:48:14 PDT 2011
Create the Logical Volume on both servers:
[root@c6server1 ~]# lvcreate -L 40GB -n drbd-main vg_c6server1
[root@c6server2 ~]# lvcreate -L 40GB -n drbd-main vg_c6server2
~ Notice that each of my servers has a different volume group (vg_c6server1 and vg_c6server2). If you copy and paste these commands, make sure you change the names to match your own volume groups.
!!!!! VERY IMPORTANT – DON’T FORMAT AND MOUNT THE VOLUME !!!!!
Save the original global configuration file on both servers:
# mv /etc/drbd.d/global_common.conf /etc/drbd.d/global_common.sample
and create a new file:
# vi /etc/drbd.d/global_common.conf
global { usage-count no; }
common {
syncer { rate 10M; }
}
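The syncer rate caps how fast DRBD resynchronizes, so it also determines roughly how long the initial sync will take. As a back-of-the-envelope sketch (sync_minutes is a hypothetical helper, and it assumes that the 10M rate means about 10 MiB per second and that the network and disks can sustain it):

```shell
# Rough initial-sync duration in whole minutes (rounded up) for a volume of
# <size_mib> MiB resynced at <rate_mib_s> MiB per second.
# Usage: sync_minutes <size_mib> <rate_mib_s>
sync_minutes() {
    # seconds = size / rate; add 59 before dividing by 60 to round up
    echo $(( ($1 / $2 + 59) / 60 ))
}
```

`sync_minutes 40960 10` prints 69, so expect the first sync of a 40 GB volume to take a bit over an hour at this rate.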
Create a new resource file, which I called “main”. It has to be identical on both servers:
# vi /etc/drbd.d/main.res
resource main {
protocol C;
startup { wfc-timeout 0; degr-wfc-timeout 120; }
disk { on-io-error detach; }
on c6server1 {
device /dev/drbd0;
disk /dev/vg_c6server1/drbd-main;
meta-disk internal;
address 10.0.0.1:7788;
}
on c6server2 {
device /dev/drbd0;
disk /dev/vg_c6server2/drbd-main;
meta-disk internal;
address 10.0.0.2:7788;
}
}
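The name after each "on" keyword has to match what uname -n prints on that server, otherwise DRBD will not find a section for the local host. A small sanity check could look like the sketch below; the function names are invented for this example, and the awk pattern assumes the "on <host> {" layout used in the file above.

```shell
# Print the host name of every "on <host> {" section in a DRBD resource file.
drbd_res_hosts() {
    awk '$1 == "on" && $3 == "{" { print $2 }' "$1"
}

# Succeed if the given host name has its own "on" section in the file.
drbd_res_has_host() {
    drbd_res_hosts "$1" | grep -qxF "$2"
}

# Example (run on each server):
# drbd_res_has_host /etc/drbd.d/main.res "$(uname -n)" \
#     || echo "no 'on' section for $(uname -n)"
```

Running `drbdadm dump main` is another quick check: it parses the configuration and prints it back, so syntax errors show up before you create the metadata.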
Now we have to create the metadata on both servers:
[root@c6server1]# drbdadm create-md main
[root@c6server2]# drbdadm create-md main
and we can now start DRBD on both servers at the same time:
[root@c6server1 drbd.d]# service drbd start
Starting DRBD resources: [ d(main) s(main) n(main) ].
[root@c6server2 drbd.d]# service drbd start
Starting DRBD resources: [ d(main) s(main) n(main) ].
There are two ways that I know of to verify that DRBD is running properly:
# service drbd status
# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by dag@Build64R6, 2011-08-08 08:54:05
0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:41941724
As you can see, the nodes are connected, but ro is Secondary/Secondary, meaning that we haven’t told DRBD which node is the Primary (master) holding the data to be replicated. Once we tell it which node is the master, the synchronization will start.
We will tell DRBD that server1 is the Primary server:
On server1:
# drbdsetup /dev/drbd0 primary -o
# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by dag@Build64R6, 2011-08-08 08:54:05
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:503808 nr:0 dw:0 dr:504472 al:0 bm:30 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:41437916
[>....................] sync'ed: 1.3% (40464/40956)M
finish: 1:06:40 speed: 10,340 (10,280) K/sec
The synchronization has started and will take a while to complete. Wait until it is done before moving on to the next step:
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:41941724 nr:0 dw:0 dr:41942388 al:0 bm:2560 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
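Instead of re-running cat by hand, you can let a small loop wait for the sync to finish. This is a sketch only: the function names are made up, the 30-second interval is arbitrary, and the check assumes a single DRBD resource as in this tutorial (with several resources it would succeed as soon as any one of them is in sync).

```shell
# Succeed when the given /proc/drbd text shows a connected, fully synced pair.
drbd_in_sync() {
    printf '%s\n' "$1" | grep -q 'cs:Connected .*ds:UpToDate/UpToDate'
}

# Poll /proc/drbd until the resource is in sync.
wait_for_drbd_sync() {
    until drbd_in_sync "$(cat /proc/drbd)"; do
        sleep 30
    done
}
```

Calling `wait_for_drbd_sync && echo "DRBD is in sync"` on server1 returns once the status line reads cs:Connected with ds:UpToDate/UpToDate.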
Now that our servers are in sync, we can format /dev/drbd0 with our preferred file system. In my case I will use ext4.
On server1:
# mkfs.ext4 /dev/drbd0
Configuring NFS exports for Heartbeat integration
Okay, now we have DRBD up and running. Great!! The next step is to set up our NFS export and then install Heartbeat so that we have automatic failover.
Let's prepare NFS first:
On both servers:
# mkdir /drbd
# vi /etc/exports
/drbd/main *(rw)
On server1 only:
# mount /dev/drbd0 /drbd
# mkdir /drbd/main
NFS stores some information about your NFS mounts in /var/lib/nfs, and since this information has to be mirrored we will move it to the DRBD device:
# mv /var/lib/nfs/ /drbd/
This might generate errors such as:
mv: cannot remove `/var/lib/nfs/rpc_pipefs/cache': Operation not permitted
mv: cannot remove `/var/lib/nfs/rpc_pipefs/nfsd4_cb': Operation not permitted
mv: cannot remove `/var/lib/nfs/rpc_pipefs/statd': Operation not permitted
mv: cannot remove `/var/lib/nfs/rpc_pipefs/portmap': Operation not permitted
mv: cannot remove `/var/lib/nfs/rpc_pipefs/nfs': Operation not permitted
mv: cannot remove `/var/lib/nfs/rpc_pipefs/mount': Operation not permitted
mv: cannot remove `/var/lib/nfs/rpc_pipefs/lockd': Operation not permitted
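but do not worry about them: the copy on /drbd is complete, and only the removal of the rpc_pipefs entries (a pseudo-filesystem mounted under /var/lib/nfs) fails. Rename what is left of the original directory: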
# mv /var/lib/nfs /var/lib/nfsBackup
Then symlink /var/lib/nfs to our /drbd directory:
# ln -s /drbd/nfs/ /var/lib/nfs
# umount /drbd
On server2 only:
# mv /var/lib/nfs/ /var/lib/nfsBackup
# ln -s /drbd/nfs/ /var/lib/nfs
The symbolic link will be broken on server2, since /dev/drbd0 is not mounted there. That is expected: the link becomes valid when Heartbeat mounts /drbd on this node during a failover.
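Because a dangling link is the normal state on the standby node, it is easy to miss a link that points to the wrong place. The following sketch checks only where the link points, not whether the target exists; nfs_state_link_ok is a hypothetical helper name.

```shell
# Succeed if <link> is a symbolic link whose target is <target>.
# Trailing slashes are ignored; a dangling link still passes.
# Usage: nfs_state_link_ok <link> <target>
nfs_state_link_ok() {
    [ -L "$1" ] || return 1
    target=$(readlink "$1")
    [ "${target%/}" = "${2%/}" ]
}
```

On either server, `nfs_state_link_ok /var/lib/nfs /drbd/nfs && echo "link OK"` should print "link OK" once the steps above are done.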
Heartbeat installation and configuration
We will install heartbeat from the EPEL repository:
# rpm -Uvh http://download.fedora.redhat.com/pub/epel/6/x86_64/epel-release-6-5.noarch.rpm
# vi /etc/yum.repos.d/epel.repo
enabled=0
# yum --enablerepo=epel install heartbeat
Create the following file on both servers with the exact same content:
# vi /etc/ha.d/ha.cf
keepalive 2
deadtime 30
bcast eth0
node c6server1 c6server2
# As in the DRBD resource file, the node names must be the hostnames of your servers, i.e. whatever uname -n returns on each of them.
The next step is to create the Heartbeat resource file on both servers, again with exactly the same content:
# vi /etc/ha.d/haresources
c6server1 IPaddr::10.0.0.100/24/eth0 drbddisk::main Filesystem::/dev/drbd0::/drbd::ext4 nfslock nfs
# The first word is the hostname of the primary server. 10.0.0.100 is the address I chose as the virtual IP, which moves to the slave in case of a failure.
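The order of the entries on that line matters: Heartbeat acquires the resources from left to right and releases them from right to left, so the virtual IP comes up first, then DRBD is promoted, the file system is mounted, and only then are the NFS services started. To make the order easier to read, a throwaway helper (name invented for this example) can print it one entry per line:

```shell
# Print the preferred node and then each resource of a haresources line,
# numbered in the order Heartbeat starts them.
haresources_order() {
    printf '%s\n' "$1" |
        awk '{ print "node: " $1; for (i = 2; i <= NF; i++) print i - 1 ": " $i }'
}

# Example:
# haresources_order "$(grep -v '^#' /etc/ha.d/haresources)"
```

For the line above this lists the IPaddr entry as 1 and nfs as 5.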
The last thing is to create the authentication file on both servers again with the same content:
# vi /etc/ha.d/authkeys
auth 3
3 md5 mypassword123
This password file should only be readable by the root user:
# chmod 600 /etc/ha.d/authkeys
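A fixed password like the one above is fine for a lab, but for anything else you may prefer a random key. Heartbeat's authkeys file also accepts sha1 as a method, so one possible sketch is the following (make_authkeys is a made-up name, and reading /dev/urandom through od is just one way to get random hex digits):

```shell
# Print an authkeys file body that uses sha1 with a random 40-hex-digit key.
make_authkeys() {
    # 20 random bytes, printed as hex, with spaces and newlines stripped
    key=$(od -An -N20 -tx1 /dev/urandom | tr -d ' \n')
    printf 'auth 1\n1 sha1 %s\n' "$key"
}

# Example (as root, on one server only):
# (umask 077; make_authkeys > /etc/ha.d/authkeys)
```

Generate the file on one server and copy it to the other, since both nodes must share the same key; the umask in the example keeps it readable by root only.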
Testing the Cluster
OK, now we should be ready to go… Let's test it!!
On both servers start heartbeat service:
# service heartbeat start
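The tutorial starts the services by hand, so you may also want to decide what happens at boot. The layout below is an assumption on my part and not something the steps above configure: DRBD and Heartbeat start at boot, while nfs and nfslock stay off because Heartbeat starts them from haresources. The run wrapper and DRY_RUN variable are hypothetical conveniences for previewing the commands.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assumed boot-time layout: DRBD and Heartbeat on, NFS left to Heartbeat.
set_boot_services() {
    run chkconfig drbd on
    run chkconfig heartbeat on
    run chkconfig nfs off
    run chkconfig nfslock off
}
```

Run `DRY_RUN=1 set_boot_services` first to see the four chkconfig commands, then without DRY_RUN as root on both servers if the layout suits your setup.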
On server1:
[root@c6server1 ~]# ifconfig
eth0 Link encap:Ethernet HWaddr 08:00:27:95:AB:B1
inet addr:10.0.0.1 Bcast:10.0.4.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe95:abb1/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2189638 errors:0 dropped:0 overruns:0 frame:0
TX packets:30442386 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:188528923 (179.7 MiB) TX bytes:45853044392 (42.7 GiB)
eth0:0 Link encap:Ethernet HWaddr 08:00:27:95:AB:B1
inet addr:10.0.0.100 Bcast:10.0.4.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:305 errors:0 dropped:0 overruns:0 frame:0
TX packets:305 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:35012 (34.1 KiB) TX bytes:35012 (34.1 KiB)
[root@c6server1 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_c6server1-lv_root
6.6G 905M 5.4G 15% /
tmpfs 499M 0 499M 0% /dev/shm
/dev/sda1 485M 60M 401M 13% /boot
sunrpc 40G 176M 38G 1% /var/lib/nfs/rpc_pipefs
/dev/drbd0 40G 176M 38G 1% /drbd
Let's test the DRBD/NFS failover now
You can shut down server1, or simply stop the heartbeat service:
On server1:
# service heartbeat stop
On server2:
[root@c6server2 ~]# ifconfig
eth0 Link encap:Ethernet HWaddr 08:00:27:8F:3B:50
inet addr:10.0.0.2 Bcast:10.0.4.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe8f:3b50/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:30447253 errors:0 dropped:0 overruns:0 frame:0
TX packets:2138369 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:45799991579 (42.6 GiB) TX bytes:173195698 (165.1 MiB)
eth0:0 Link encap:Ethernet HWaddr 08:00:27:8F:3B:50
inet addr:10.0.0.100 Bcast:10.0.4.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:208 errors:0 dropped:0 overruns:0 frame:0
TX packets:208 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:23480 (22.9 KiB) TX bytes:23480 (22.9 KiB)
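Rather than reading through the ifconfig output by eye, you can test for the virtual IP directly. The sketch below assumes the "inet addr:" output format shown above (newer ifconfig versions print it differently); has_ip is a made-up helper name.

```shell
# Succeed if the ifconfig output on stdin contains the given IPv4 address.
# The trailing space keeps 10.0.0.10 from matching 10.0.0.100.
has_ip() {
    grep -qF "inet addr:$1 "
}

# Example:
# ifconfig | has_ip 10.0.0.100 && echo "this node holds the virtual IP"
```

After the failover the example should print its message on server2 and stay silent on server1.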
Last Updated (Friday, 19 August 2011 18:27)



