CCA 131 – Perform OS-level configuration for Hadoop installation

Note: This post is part of the CCA Administrator Exam (CCA131) objectives series

In the last post we set up the local CDH repository. However, before you perform any installation, or even set up the repository, you must complete a few OS-level configurations. These include:

  1. Enabling NTP
  2. Configuring Network Names (hostnames/FQDNs)
  3. Disabling SELinux
  4. Disabling the Firewall

We will cover each of the OS-level configurations in detail below. Make sure you follow all the steps given below to have a hassle-free CDH installation.

1. Enabling NTP

1. To install NTP on CentOS/RHEL 7 systems:

# yum install ntp

2. Edit the NTP configuration file /etc/ntp.conf and add the NTP servers available in your setup.

# vi /etc/ntp.conf
server 192.168.1.100
server 192.168.1.101
server 192.168.1.102

3. Start the ntpd service and enable it to start automatically on boot.

# systemctl start ntpd
# systemctl enable ntpd

Note: In CentOS/RHEL 7, chronyd replaces ntpd as the default network time protocol daemon. ntpd is still available in the yum repository for those who need it, and I have used it here because the CCA 131 exam objectives call for NTP. Do not run ntpd and chronyd at the same time, since both try to discipline the system clock; if chronyd is active, stop and disable it (systemctl stop chronyd; systemctl disable chronyd) before starting ntpd.
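
Once ntpd is running, it is worth confirming that it has actually picked a time source. The snippet below is a minimal sketch, assuming the usual ntpq -pn peer listing in which the currently selected peer is marked with a leading "*". The function name selected_peer is made up for this post and is not part of the ntp package.

```shell
# Print the peer that ntpd has selected as its time source.
# Reads `ntpq -pn` style output on stdin; the selected peer's line starts with '*'.
# Returns non-zero if no peer has been selected yet.
selected_peer() {
    awk '/^\*/ { print substr($1, 2); found = 1 } END { exit !found }'
}

# Typical use on a node (ntpd may need a few minutes after start to pick a peer):
#   ntpq -pn | selected_peer || echo "ntpd has not synchronized yet"
```

If the function prints one of the servers you added to /etc/ntp.conf, the node is synchronizing against it.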

2. Configuring Network Names (hostnames/FQDNs)

All nodes in the CDH cluster must be able to communicate with each other, and the FQDN of each node must resolve to its IP address.

1. Add the IP address of each node of the CDH cluster and their respective FQDN in the file /etc/hosts:

# vi /etc/hosts
192.168.1.10 master.localdomain
192.168.1.11 node01.localdomain
192.168.1.12 node02.localdomain
192.168.1.13 node03.localdomain

Note: If you are using DNS, storing this information in /etc/hosts is not required, but it is good practice.

2. Verify that you can get the FQDN of each node of the cluster using the command:

# hostname -f
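
A quick sanity check on the hosts file can catch entries that were added without a domain part. This is a small sketch, assuming the one-IP-then-names layout shown above; the function name hosts_missing_fqdn is my own and hypothetical.

```shell
# List /etc/hosts-style entries whose first name is not fully qualified
# (i.e. contains no dot). Blank lines and comment lines are skipped.
# Usage: hosts_missing_fqdn FILE
hosts_missing_fqdn() {
    awk '!/^[ \t]*(#|$)/ && $2 !~ /\./ { print $1, $2 }' "$1"
}

# Typical use on a node:
#   hosts_missing_fqdn /etc/hosts
#   getent hosts node01.localdomain   # confirm the name resolves to the expected IP
```

On a stock system the loopback lines will usually be reported as well, because their first name is plain "localhost"; only the cluster node entries matter here.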

3. Disabling SELinux

1. To get the current status of SELinux:

# sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      31
# getenforce
Enforcing

In my case, SELinux is enabled and in enforcing mode. You can skip this section if the mode is already “permissive”.

2. Edit the /etc/selinux/config file and change the parameter value “SELINUX=enforcing” to “SELINUX=permissive”. The parameter name must be in upper case.

# vi /etc/selinux/config
SELINUX=permissive

3. Either reboot the system or run the command below for the change to take effect immediately.

# setenforce 0

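I usually disable SELinux completely using “SELINUX=disabled” and reboot the system after all the OS-level configuration is complete. Note that setenforce can only switch between enforcing and permissive; the disabled mode takes effect only after a reboot.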

4. Verify the status again:

# getenforce
Permissive
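
If you have several nodes to prepare, the edit in step 2 can be scripted instead of done by hand in vi. Below is a rough sketch, assuming GNU sed (the default on CentOS/RHEL) and a config file with a single uncommented SELINUX= line; set_selinux_mode is a hypothetical helper name, not an SELinux tool.

```shell
# Set the SELINUX= line in an SELinux config file to the given mode.
# Usage: set_selinux_mode FILE MODE   (MODE: enforcing, permissive or disabled)
set_selinux_mode() {
    case "$2" in
        enforcing|permissive|disabled) ;;
        *) echo "invalid mode: $2" >&2; return 1 ;;
    esac
    # Only the uncommented SELINUX= line is rewritten;
    # SELINUXTYPE= and comment lines are left untouched.
    sed -i "s/^SELINUX=.*/SELINUX=$2/" "$1"
}

# Typical use as root:
#   cp /etc/selinux/config /etc/selinux/config.bak
#   set_selinux_mode /etc/selinux/config permissive
#   setenforce 0
```

I point it at /etc/selinux/config rather than /etc/sysconfig/selinux because the latter is a symbolic link, and sed -i replaces a link with a regular file unless it is told to follow symlinks.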

4. Disabling the Firewall

Last but not least, disable the firewall on the system. The default firewall in CentOS/RHEL 6 is iptables, whereas CentOS/RHEL 7 uses firewalld.

For CentOS/RHEL 6

# chkconfig iptables off
# service iptables stop

For CentOS/RHEL 7

# systemctl disable firewalld
# systemctl stop firewalld
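
When the same preparation script has to run on both releases, the service name can be derived from the release string. This is only a sketch, assuming the string has the form found in /etc/redhat-release (for example "CentOS Linux release 7.9.2009 (Core)"); firewall_service is a hypothetical helper name.

```shell
# Print the firewall service to disable for a given release string,
# e.g. the contents of /etc/redhat-release.
# Usage: firewall_service "CentOS Linux release 7.9.2009 (Core)"
firewall_service() {
    # Extract the major version number that follows the word "release".
    major=$(printf '%s\n' "$1" | sed -n 's/.*release \([0-9][0-9]*\).*/\1/p')
    case "$major" in
        6) echo iptables ;;
        7) echo firewalld ;;
        *) echo "unsupported release: $1" >&2; return 1 ;;
    esac
}

# Typical use:
#   firewall_service "$(cat /etc/redhat-release)"
```

On CentOS/RHEL 7 you can then confirm the result with systemctl is-active firewalld and systemctl is-enabled firewalld, which should report inactive and disabled respectively.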

Other Recommended Settings

Along with the four OS-level configurations above, Cloudera recommends disabling transparent hugepages and setting “vm.swappiness” to a low value.

Verify if THP is enabled

Transparent Huge Pages (THP) are enabled by default in RHEL 6 for all applications. The kernel attempts to allocate hugepages whenever possible and any Linux process will receive 2MB pages if the mmap region is 2MB naturally aligned.

To verify whether THP is enabled or disabled:

# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
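
Note: The value in square brackets is the active setting. THP can be switched off on a running machine by writing “never” to this file (echo never > /sys/kernel/mm/transparent_hugepage/enabled), but that change is lost at the next reboot. To disable THP persistently, set the kernel parameter as described below and reboot.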
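
For use in a check script, the active setting can be pulled out of that file directly. A minimal sketch, assuming the "[always] madvise never" format shown above; thp_mode is a hypothetical function name.

```shell
# Print the active THP mode (the word in square brackets) from a file in the
# format of /sys/kernel/mm/transparent_hugepage/enabled.
# Usage: thp_mode [FILE]   (FILE defaults to the real sysfs path)
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)].*/\1/p' "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# Typical use on a node:
#   [ "$(thp_mode)" = "never" ] || echo "THP is still enabled"
```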

Disabling THP

1. Add the “transparent_hugepage=never” kernel parameter to the GRUB_CMDLINE_LINUX option in the /etc/default/grub file. If the parameter is already present with another value, change it to “never”.

# vi /etc/default/grub
GRUB_TIMEOUT=5
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="nomodeset crashkernel=auto rd.lvm.lv=vg_os/lv_root rd.lvm.lv=vg_os/lv_swap rhgb quiet transparent_hugepage=never"
GRUB_DISABLE_RECOVERY="true"

2. Take a backup of the existing /boot/grub2/grub.cfg file, then rebuild it by running the grub2-mkconfig -o command.

# cp /boot/grub2/grub.cfg /boot/grub2/grub.cfg.bak
# grub2-mkconfig -o /boot/grub2/grub.cfg

3. Reboot the system for the option to take effect.

# shutdown -r now

4. Verify that the parameter is set correctly:

# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-514.10.2.el7.x86_64 root=/dev/mapper/vg_os-lv_root ro nomodeset crashkernel=auto rd.lvm.lv=vg_os/lv_root rd.lvm.lv=vg_os/lv_swap rhgb quiet transparent_hugepage=never LANG=en_US.UTF-8
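
Step 1 can also be scripted so that it is safe to run more than once. The following is a sketch under a few assumptions: GNU sed, and a GRUB_CMDLINE_LINUX value that is double-quoted on a single line as in the listing above. add_thp_never is a hypothetical helper name.

```shell
# Ensure transparent_hugepage=never is present in GRUB_CMDLINE_LINUX.
# Usage: add_thp_never FILE   (normally /etc/default/grub)
add_thp_never() {
    if grep -q '^GRUB_CMDLINE_LINUX=.*transparent_hugepage=' "$1"; then
        # Parameter already present: force its value to "never".
        sed -i '/^GRUB_CMDLINE_LINUX=/ s/transparent_hugepage=[a-z]*/transparent_hugepage=never/' "$1"
    else
        # Parameter absent: append it just before the closing quote.
        sed -i '/^GRUB_CMDLINE_LINUX=/ s/"$/ transparent_hugepage=never"/' "$1"
    fi
}

# Typical use as root, followed by steps 2 and 3 above:
#   cp /etc/default/grub /etc/default/grub.bak
#   add_thp_never /etc/default/grub
```

After the reboot, grep -o 'transparent_hugepage=[a-z]*' /proc/cmdline gives a shorter confirmation than reading the whole command line.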

VM swappiness

Swappiness is a Linux kernel parameter that controls the balance between swapping out runtime memory and dropping pages from the system page cache. It can be set to a value between 0 and 100, inclusive. A low value makes the kernel avoid swapping as much as possible, whereas a higher value makes it use swap space more aggressively.

1. Cloudera recommends setting vm.swappiness to a value equal to or below 10. To view the value configured in the tuned profile (virtual-guest in my case):

# grep vm.swappiness /usr/lib/tuned/virtual-guest/tuned.conf 
vm.swappiness = 30

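2. Let’s set the value of vm.swappiness to 10. Use sed to change only that line; redirecting echo output into the file with “>” would overwrite the rest of the profile.

# sed -i 's/^vm.swappiness.*/vm.swappiness = 10/' /usr/lib/tuned/virtual-guest/tuned.conf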

3. Verify the value of vm.swappiness again.

# grep vm.swappiness /usr/lib/tuned/virtual-guest/tuned.conf 
vm.swappiness = 10
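
The grep above only shows what is written in the profile file. To check what the running kernel is actually using, read /proc/sys/vm/swappiness (or run sysctl -n vm.swappiness) and compare it with the ceiling. A tiny sketch; swappiness_ok is a hypothetical helper name and the default ceiling of 10 simply mirrors the recommendation above.

```shell
# Succeed if a swappiness value is at or below the recommended ceiling.
# Usage: swappiness_ok VALUE [MAX]   (MAX defaults to 10)
swappiness_ok() {
    [ "$1" -le "${2:-10}" ]
}

# Typical use on a node:
#   swappiness_ok "$(cat /proc/sys/vm/swappiness)" || echo "vm.swappiness is above 10"
```

If the running value still shows the old number, the profile has most likely not been re-applied yet; re-activating it (for example with tuned-adm profile virtual-guest) or rebooting should pick up the change.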