CentOS 6.4 installation

Introduction

My old Ubuntu server had serious performance problems with NFS exported home directories and the decision has been taken to use CentOS as the next server distro. The server had one HD with four partitions for the current system, previous system, tmp directories and swap, and a RAID6 array of four 1TB HDs, each with a single partition.

Revision history
2013/04/11 11:24 Created
2013/04/09 17:32 Added section for bind mounted home directories

Getting ready

Download netinstall, DVD1 and DVD2 ISOs:

wget http://www.mirrorservice.org/sites/mirror.centos.org/6.4/isos/i386/CentOS-6.4-i386-netinstall.iso
wget http://www.mirrorservice.org/sites/mirror.centos.org/6.4/isos/i386/CentOS-6.4-i386-bin-DVD1.iso
wget http://www.mirrorservice.org/sites/mirror.centos.org/6.4/isos/i386/CentOS-6.4-i386-bin-DVD2.iso

Burn netinstall ISO to CD, make DVDs available on a local webserver or on a USB drive.
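
A minimal sketch of this step (device names and paths below are examples only; adjust them for your setup):

wodim -v dev=/dev/sr0 CentOS-6.4-i386-netinstall.iso
cp CentOS-6.4-i386-bin-DVD*.iso /var/www/html/centos/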

Note down the server’s old network settings if they are not obtained via DHCP.
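
For example, the current settings can be captured to a file before the reinstall (commands as available on the old Ubuntu system):

ifconfig -a > network-settings-old.txt
route -n >> network-settings-old.txt
cat /etc/resolv.conf >> network-settings-old.txt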

Shut the system down and disconnect RAID disks (just in case).

shutdown -P now

Installation

Boot from netinstall CD.

Press Tab at the boot menu and add the ‘vnc’ boot option. This may be required because the standard console-based anaconda installer does not always allow custom partitioning or the specification of other installation details; for these, the graphical installer is needed.

Select the language and the keyboard type and layout, then configure the network connection.

Once the VNC server is up, connect via Remote Desktop Viewer running on another machine as instructed. On Ubuntu vinagre can be used for this.
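
For example, assuming the installer reports display :1 (it prints the exact address and display number to connect to):

vinagre server:1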

Follow graphical installation, select basic server install type and include NFS server.

Once the installation is complete, reboot. At this point the server will be running in console mode without an X11 front-end. For all admin tasks, I prefer to connect to the server via ssh and run the various GUI admin tools from my desktop machine.

On the server terminal log in as root and enable X11 auth needed for ssh forwarding:

yum install xorg-x11-xau*
exit

From a desktop machine connect to the server via ssh:

ssh -X root@server

Update the system:

yum update

If necessary, e.g. if the kernel packages have been updated, shut the system down and reboot, then log in again.
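
A quick sketch of the reboot from the ssh session (the connection will drop, as expected):

shutdown -r now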

Configure yum to exclude kernel packages from future updates:

nano /etc/yum.conf
exclude = kernel*

Remember to update any excluded packages manually when convenient.

You can also use the system admin GUI for package management:

gpk-application

Set static network address

During installation network settings may have been obtained automatically by DHCP; if a static network configuration is desired, follow the steps below, replacing the various IP addresses with ones corresponding to your network.

system-config-network
; eth0 interface
Inet addr: 10.0.0.253
Mask:      255.0.0.0
Bcast:     10.255.255.255
; Router
Gateway:   10.0.0.254
; DNS
search your.domain
nameserver 10.0.0.254
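
system-config-network writes these settings to the standard files under /etc/sysconfig; for reference, the resulting ifcfg-eth0 should look roughly like this (a sketch based on the values above; your file may contain additional lines such as HWADDR), after which the network service can be restarted:

cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
IPADDR=10.0.0.253
NETMASK=255.0.0.0
GATEWAY=10.0.0.254
service network restart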

Configure postfix

Modify postfix so that mail will get delivered to ‘proper’ external e-mail addresses.

cp /etc/postfix/main.cf /etc/postfix/main.cf.ori
nano /etc/postfix/main.cf

Edit the following lines; replace relayhost with your ISP’s SMTP server.

myorigin = your.domain
inet_interfaces = all
relayhost = smtp.yourisp.com
service postfix restart

Additional useful mail admin commands:

Check mail queue:

mailq

Remove message from queue:

postsuper -d msgID

Send mail from the command line or from scripts:

mail
mailx
sendmail
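
For example, to send a quick test message and confirm that relaying works (the recipient address is a placeholder):

echo "postfix relay test from $(hostname)" | mail -s "test" system.administrator@your.domain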

Install APC UPS daemon

wget http://www.mirrorservice.org/sites/dl.fedoraproject.org/pub/epel/6/i386/apcupsd-3.14.10-1.el6.i686.rpm
yum install apcupsd-3.14.10-1.el6.i686.rpm

Since I am using a SmartUPS with an APC ‘smart’ serial cable, I had to make the following changes:

cp /etc/apcupsd/apcupsd.conf /etc/apcupsd/apcupsd.conf.ori
nano /etc/apcupsd/apcupsd.conf
UPSNAME ServerPS
UPSCABLE smart
UPSTYPE apcsmart
DEVICE /dev/ttyS0
NISIP 127.0.0.1

Change the e-mail address used for notifications:

cd /etc/apcupsd
for id in  apccontrol changeme commfailure commok offbattery onbattery  ; do cp -i $id $id.ori ; sed -e 's/SYSADMIN=root/SYSADMIN=system.administrator@your.domain/' < $id.ori > $id ; done
service apcupsd restart

For an admin GUI to apcupsd, take a look at apcupsd-gui and apcupsd-cgi.
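
The current UPS status can also be queried on the command line with the apcaccess tool that ships with apcupsd:

apcaccess status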

Enable SMART monitoring for HDDs

cp /etc/smartd.conf /etc/smartd.conf.ori
nano /etc/smartd.conf
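# short self-test every day at 01:00, long self-test on the 15th of each month at 02:00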
DEVICESCAN -n standby,7,q -m system.administrator@your.domain -s (S/../.././01|L/../15/./02)
service smartd restart

Configure automatic updates

yum install yum-cron
cp -i /etc/sysconfig/yum-cron /etc/sysconfig/yum-cron.ori
nano /etc/sysconfig/yum-cron
DOWNLOAD_ONLY=yes
MAILTO=system.administrator@your.domain
SYSTEMNAME="server"
service yum-cron restart

Add users

I prefer user private groups, and I manually specify UIDs and GIDs starting at 1000 to conform to the Ubuntu convention:

system-config-users
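
If a GUI is not available, the same can be done on the command line; a sketch for a hypothetical user 'user1' with UID and GID 1000:

groupadd -g 1000 user1
useradd -u 1000 -g user1 -m -s /bin/bash user1
passwd user1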

Activate RAID

Configure the RAID array:

nano /etc/mdadm.conf
ARRAY /dev/md0 level=raid6 num-devices=4 UUID=5f8a6291:38b5cf4e:dc0d88ba:780ef6a3 auto=md
devices=/dev/sdc1,/dev/sdd1,/dev/sde1,/dev/sdb1
MAILADDR system.administrator@your.domain
MAILFROM root.server@your.domain
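
Once the array has been assembled (after the disks are reconnected below), the ARRAY line can also be generated automatically and compared against the one entered above:

mdadm --detail --scan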

Shut the system down, connect the RAID disks and reboot.

shutdown -P now

Once the server has finished booting, connect to it again from your desktop machine via ssh:

ssh -X root@server

The kernel should have picked up the RAID info and assembled the array automatically. The result should look like this:

ls -l /dev/md*
brw-rw----. 1 root disk 9, 0 Apr 11 21:59 /dev/md0
cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdb1[3] sdc1[0] sdd1[1] sde1[2]
1953522944 blocks level 6, 64k chunk, algorithm 2 [4/4] [UUUU]
unused devices:
mdadm --detail /dev/md0
/dev/md0:
Version : 0.90
Creation Time : Fri Sep  2 17:16:11 2011
Raid Level : raid6
Array Size : 1953522944 (1863.02 GiB 2000.41 GB)
Used Dev Size : 976761472 (931.51 GiB 1000.20 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Fri Apr 12 16:08:18 2013
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
UUID : 5f8a6291:38b5cf4e:dc0d88ba:780ef6a3
Events : 0.4607673
Number   Major   Minor   RaidDevice State
0       8       33        0      active sync   /dev/sdc1
1       8       49        1      active sync   /dev/sdd1
2       8       65        2      active sync   /dev/sde1
3       8       17        3      active sync   /dev/sdb1

Once the RAID array is up and running, mount it and make it available via NFS.

Configure md monitoring and data scrubbing

It is absolutely essential to implement a monitoring and data scrubbing regime for RAID arrays; otherwise errors can build up silently and eventually result in catastrophic data loss.

The check is run automatically by cron; to modify the time or frequency of the check the cron job has to be edited:

nano /etc/cron.d/raid-check
To set the parameters of the check:
nano /etc/sysconfig/raid-check

Settings used for checking all RAID volumes:

ENABLED=yes
CHECK=check
NICE=low
CHECK_DEVS=""
REPAIR_DEVS=""
SKIP_DEVS=""

The progress of the check can be seen by

cat /proc/mdstat

and the number of errors can be displayed by

cat /sys/block/md0/md/mismatch_cnt

If the count is not zero, the problems will have to be fixed manually.
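
One way to do this, assuming the mismatches are not a symptom of a failing disk, is to ask md to rewrite the redundancy information and, once that has completed, re-run the check:

echo repair > /sys/block/md0/md/sync_action
cat /proc/mdstat
echo check > /sys/block/md0/md/sync_action
cat /sys/block/md0/md/mismatch_cnt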

Add more users

For some users on the system, the home directories reside on the RAID volume. One option is to use symbolic links in the /home directory pointing to the user home directories on the RAID volume:

# create new user 'user1' as above
sudo mv /home/user1 /home/user1.local
sudo ln -s /mnt/raid/homes/user1 /home/

Unfortunately this has the potential to break if changes occur to the directory structure on the RAID volume. Some programs remember the de-referenced paths for symbolic links. If the symbolic link is modified to account for changes in the directory structure, these programs will try to use the old paths that may no longer exist or may point to stale versions of the files.

A better option is to use bind mounts:

# create new user 'user1' as above
sudo mv /home/user1 /home/user1.local
sudo mkdir /home/user1
sudo nano /etc/fstab
/mnt/raid/homes/user1        /home/user1    none rw,bind 0 0
sudo mount -a

Configure NFS exports

Configure idmapd, which will be needed by the NFS server; this has to match the idmapd configurations on the NFS clients:

nano /etc/idmapd.conf

Lines to change:

# set your own domain here, if id differs from FQDN minus hostname
# Domain = localdomain
Domain = hlan.your.domain

Create mount points in /mnt and in /exports, add fstab entries for the RAID volume and the bind mounts for NFS:

mkdir /mnt/raid
mkdir -p /exports/{books,homes,music,opt,pictures,SW,video,VMs}
nano /etc/fstab
#
# raid array - sdc1, sdd1, sde1, sdb1
#
UUID=1960fa99-593b-4601-99bf-d5064fdef53e /mnt/raid               ext4    relatime        0 0
#
# bind mounts for NFSv4 exports
#
/mnt/raid/homes    /exports/homes       none rw,bind 0 0
/mnt/raid/music    /exports/music       none rw,bind 0 0
/mnt/raid/video    /exports/video       none rw,bind 0 0
/mnt/raid/VMs      /exports/VMs         none rw,bind 0 0
/mnt/raid/opt      /exports/opt         none rw,bind 0 0
/mnt/raid/SW       /exports/SW          none rw,bind 0 0
/mnt/raid/books    /exports/books       none rw,bind 0 0
/mnt/raid/pictures /exports/pictures    none rw,bind 0 0

Mount it all:

mount -a

We are now ready to define our NFS exports:

nano /etc/exports
#
# NFSv4 exports
###############
#
/exports          *.hlan.your.domain(ro,no_subtree_check,sync,no_root_squash,fsid=0)
#
# pc1
/exports/homes    pc1.hlan.your.domain(rw,no_subtree_check,sync,no_root_squash,nohide)
/exports/opt      pc1.hlan.your.domain(ro,no_subtree_check,sync,no_root_squash,nohide)
/exports/VMs      pc1.hlan.your.domain(rw,no_subtree_check,sync,no_root_squash,nohide)
/exports/SW       pc1.hlan.your.domain(ro,no_subtree_check,sync,no_root_squash,nohide)
/exports/books    pc1.hlan.your.domain(ro,no_subtree_check,sync,no_root_squash,nohide)
/exports/music    pc1.hlan.your.domain(ro,no_subtree_check,sync,no_root_squash,nohide)
/exports/video    pc1.hlan.your.domain(ro,no_subtree_check,sync,no_root_squash,nohide)
/exports/pictures pc1.hlan.your.domain(ro,no_subtree_check,sync,no_root_squash,nohide)
#
# NFSv3 exports
###############
#
# network boot root file systems
# currently not used
#
#/netboot/rootfs/pc2    pc2.hlan.your.domain(rw,no_root_squash,no_subtree_check,async)

Activate NFS with the new shares:

service nfs restart
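
The active exports can then be verified on the server and from a client machine:

exportfs -v
showmount -e server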

NTP

Specify ntp servers:

nano /etc/ntp.conf

and add the following lines

server 0.uk.pool.ntp.org
server 1.uk.pool.ntp.org
server 2.uk.pool.ntp.org

Send log messages to separate log file instead of the default /var/log/messages:

nano /etc/sysconfig/ntpd
OPTIONS="-u ntp:ntp -l /var/log/ntpd -p /var/run/ntpd.pid -g"
touch /var/log/ntpd
chown ntp.ntp /var/log/ntpd
service ntpd restart

If you want to modify the verbosity of the log, take a look at the logconfig directive:

man ntp_misc
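
For example, a line like the following in /etc/ntp.conf (a sketch; see the man page for the full syntax) gives fairly detailed logging:

logconfig =syncevents +peerevents +sysevents +allclock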

System maintenance tasks

Automatic file system checks on boot were causing long delays, seemingly always at the worst possible time, so those checks were disabled. The RAID volume now has to be checked manually:

/etc/init.d/nfs stop
umount /exports/*
umount /mnt/raid/
fsck -f -C /dev/md0 
mount -a
/etc/init.d/nfs start

Updates are only downloaded automatically and have to be installed manually:

yum update

If the normally excluded kernel packages are also to be updated:

yum --disableexcludes=all update

Install rsyncd

For this particular server, rsync will only be used for making backups. In this case there is no real benefit in running an rsyncd service; if anything, rsync+ssh is preferable from a security point of view.

If the server were sharing, for example, a software repository mirror, there might be a performance advantage to rsyncd, but even then other forms of access may still be preferable.
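
A minimal sketch of such a backup, pulling the home directories from the server to a backup machine over ssh (paths are examples):

rsync -aH --delete -e ssh root@server:/mnt/raid/homes/ /backup/server/homes/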

Configure services

The commands below were required on my system; for your installation, a different set of similar commands may be needed.

chkconfig abrtd off
chkconfig abrt-ccpp off
chkconfig irqbalance off
chkconfig lvm2-monitor off
chkconfig kdump off
chkconfig haldaemon off
chkconfig rpcgssd off

chkconfig --level 2345 smartd on
chkconfig --level 2345 apcupsd on
chkconfig --level 2345 yum-cron on
chkconfig --level 2345 ntpdate on
chkconfig --level 2345 atd on
chkconfig --level 2345 network on
chkconfig --level 2345 certmonger on
chkconfig --level 2345 netfs on

chkconfig --level 345 ntpd on
chkconfig --level 2 ntpd off
chkconfig --level 345 nfs on
chkconfig --level 2 nfs off
chkconfig --level 345 nfslock on
chkconfig --level 2 nfslock off
chkconfig --list | sort --key=3,7 | tee >(wc -l)
; off
abrt-ccpp       0:off   1:off   2:off   3:off   4:off   5:off   6:off
abrtd           0:off   1:off   2:off   3:off   4:off   5:off   6:off
autofs          0:off   1:off   2:off   3:off   4:off   5:off   6:off
cgconfig        0:off   1:off   2:off   3:off   4:off   5:off   6:off
cgred           0:off   1:off   2:off   3:off   4:off   5:off   6:off
firstboot       0:off   1:off   2:off   3:off   4:off   5:off   6:off
haldaemon       0:off   1:off   2:off   3:off   4:off   5:off   6:off
ip6tables       0:off   1:off   2:off   3:off   4:off   5:off   6:off
ipsec           0:off   1:off   2:off   3:off   4:off   5:off   6:off
iptables        0:off   1:off   2:off   3:off   4:off   5:off   6:off
irqbalance      0:off   1:off   2:off   3:off   4:off   5:off   6:off
kdump           0:off   1:off   2:off   3:off   4:off   5:off   6:off
lvm2-monitor    0:off   1:off   2:off   3:off   4:off   5:off   6:off
netconsole      0:off   1:off   2:off   3:off   4:off   5:off   6:off
numad           0:off   1:off   2:off   3:off   4:off   5:off   6:off
oddjobd         0:off   1:off   2:off   3:off   4:off   5:off   6:off
psacct          0:off   1:off   2:off   3:off   4:off   5:off   6:off
quota_nld       0:off   1:off   2:off   3:off   4:off   5:off   6:off
rdisc           0:off   1:off   2:off   3:off   4:off   5:off   6:off
restorecond     0:off   1:off   2:off   3:off   4:off   5:off   6:off
rngd            0:off   1:off   2:off   3:off   4:off   5:off   6:off
rpcgssd         0:off   1:off   2:off   3:off   4:off   5:off   6:off
rpcsvcgssd      0:off   1:off   2:off   3:off   4:off   5:off   6:off
saslauthd       0:off   1:off   2:off   3:off   4:off   5:off   6:off
sssd            0:off   1:off   2:off   3:off   4:off   5:off   6:off
winbind         0:off   1:off   2:off   3:off   4:off   5:off   6:off
ypbind          0:off   1:off   2:off   3:off   4:off   5:off   6:off
; runlevel 3 - network services
nfs             0:off   1:off   2:off   3:on    4:on    5:on    6:off
nfslock         0:off   1:off   2:off   3:on    4:on    5:on    6:off
ntpd            0:off   1:off   2:off   3:on    4:on    5:on    6:off
rpcidmapd       0:off   1:off   2:off   3:on    4:on    5:on    6:off
; runlevel 2 - networked multi user
acpid           0:off   1:off   2:on    3:on    4:on    5:on    6:off
atd             0:off   1:off   2:on    3:on    4:on    5:on    6:off
auditd          0:off   1:off   2:on    3:on    4:on    5:on    6:off
certmonger      0:off   1:off   2:on    3:on    4:on    5:on    6:off
crond           0:off   1:off   2:on    3:on    4:on    5:on    6:off
cups            0:off   1:off   2:on    3:on    4:on    5:on    6:off
mdmonitor       0:off   1:off   2:on    3:on    4:on    5:on    6:off
messagebus      0:off   1:off   2:on    3:on    4:on    5:on    6:off
netfs           0:off   1:off   2:on    3:on    4:on    5:on    6:off
network         0:off   1:off   2:on    3:on    4:on    5:on    6:off
ntpdate         0:off   1:off   2:on    3:on    4:on    5:on    6:off
portreserve     0:off   1:off   2:on    3:on    4:on    5:on    6:off
postfix         0:off   1:off   2:on    3:on    4:on    5:on    6:off
rpcbind         0:off   1:off   2:on    3:on    4:on    5:on    6:off
rsyslog         0:off   1:off   2:on    3:on    4:on    5:on    6:off
smartd          0:off   1:off   2:on    3:on    4:on    5:on    6:off
sshd            0:off   1:off   2:on    3:on    4:on    5:on    6:off
yum-cron        0:off   1:off   2:on    3:on    4:on    5:on    6:off
; runlevel 1 - single user
apcupsd         0:off   1:on    2:on    3:on    4:on    5:on    6:off
blk-availability        0:off   1:on    2:on    3:on    4:on    5:on    6:off
cpuspeed        0:off   1:on    2:on    3:on    4:on    5:on    6:off
sysstat         0:off   1:on    2:on    3:on    4:on    5:on    6:off
udev-post       0:off   1:on    2:on    3:on    4:on    5:on    6:off
; 54 services in all

The default runlevel, as specified in /etc/inittab, is 3.

telinit 4
telinit 3
service --status-all
abrt-ccpp hook is not installed
abrtd is stopped
abrt-dump-oops is stopped
acpid is stopped
apcupsd (pid  17228) is running...
atd (pid  1526) is running...
auditd (pid  1588) is running...
automount is stopped
certmonger (pid  1538) is running...
Stopped [cgconfig]
cgred is stopped
cpuspeed is stopped
crond (pid  1515) is running...
cupsd (pid  1199) is running...
firstboot is not scheduled to run
hald is stopped
ip6tables: Firewall is not running.
IPsec stopped
iptables: Firewall is not running.
irqbalance is stopped
Kdump is not operational
lvmetad is stopped
mdmonitor (pid  17151) is running...
messagebus (pid  1182) is running...
netconsole module not loaded
Configured devices:
lo eth0
Currently active devices:
lo eth0
rpc.svcgssd is stopped
rpc.mountd (pid 17397) is running...
nfsd (pid 17462 17461 17460 17459 17458 17457 17456 17455) is running...
rpc.rquotad (pid 17393) is running...
rpc.statd (pid  1112) is running...
ntpd (pid  18365) is running...
numad is stopped
oddjobd is stopped
portreserve is stopped
master (pid  16932) is running...
Process accounting is disabled.
quota_nld is stopped
rdisc is stopped
restorecond is stopped
rngd is stopped
rpcbind (pid  1094) is running...
rpc.gssd is stopped
rpc.idmapd (pid 17448) is running...
rpc.svcgssd is stopped
rsyslogd (pid  1058) is running...
sandbox is stopped
saslauthd is stopped
smartd (pid 17210) is running...
openssh-daemon (pid  1393) is running...
sssd is stopped
winbindd is stopped
ypbind is stopped
Nightly yum update is enabled.

Firewall

Please note that the standard firewall (iptables/ip6tables) is disabled above. If your server is connected directly to the internet, you should really be running a firewall.
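
If it is needed, the stock firewall can be re-enabled along these lines (review the rules in /etc/sysconfig/iptables, or adjust them with the TUI tool, before relying on them):

chkconfig --level 2345 iptables on
service iptables start
system-config-firewall-tui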

Enable sudo

To enable sudo for one of the local users, edit the sudoers file:

nano /etc/sudoers

and include the lines

%wheel  ALL=(ALL)    ALL
Defaults    mailto="system.administrator@your.domain"

Add the local users to the wheel group using

system-config-users

In some cases, e.g. when connecting to CentOS sshd from an Ubuntu openssh client, the XAUTHORITY env variable is not set. If sudo is then used, connections to the X11 display are refused. In this case use the following:

XAUTHORITY=/home/username/.Xauthority sudo -s

Disable root remote login

Modify the sshd configuration:

sudo nano /etc/ssh/sshd_config

to include the line

PermitRootLogin      no

Then restart the ssh server:

sudo service sshd restart

Install HW identification tools

wget http://pkgs.repoforge.org/lshw/lshw-2.16-1.el6.rf.i686.rpm
sudo yum install ./lshw-2.16-1.el6.rf.i686.rpm
sudo lshw > lshw_your.machine.id_20130417.txt

To do list

The following should still be documented:

  • Is restorecond service needed?
  • RAID and automatic file system checking / Ext4 journalling / write barriers
  • Check RAID IO scheduler error message that appears at boot-time
  • Fix hardware unsupported boot-time message
  • Fix ‘eth0: excessive work at interrupt’ for via_velocity driver
  • Compile and install a new kernel specific to my hardware
