Tutorial on Linux Clustering (High Availability)

Concept of Clustering

A cluster is a group of machines that appears to the outside world as a single system. It consists of two or more real computers, referred to as nodes or members of the cluster. The nodes are commonly, but not always, connected to each other through fast local area networks.

Clusters are usually deployed to improve performance and/or availability over that provided by a single computer, while typically being much more cost-effective than a single computer of comparable speed or availability. The back-end work is shared among the nodes, but clients see only a single computational entity.

Types of Clusters

High Performance Clusters
High Availability Clusters
Load Balancing Clusters
Storage Clusters

High Performance Clusters (HPC)

Multiple nodes in the cluster perform concurrent calculations. There are two key benefits of High Performance (or grid) computing:

Resilience: As long as even a single member of a cluster is running, services continue to be provided by the cluster.
Increased Capacity: The more nodes are added to the cluster, the more computing power is available, so very powerful systems can be built from commodity hardware.

High Availability Clusters (HA)

High-availability clusters provide continuous availability of services by eliminating single points of failure and by failing over services from one cluster node to another in case a node becomes inoperative. High-availability clusters are sometimes referred to as failover clusters. Red Hat Cluster Suite provides high-availability clustering through its High-availability Service Management component.

Load Balancing Clusters

Load balancing clusters operate by having all workload come through one or more load-balancing front ends, which then distribute it to a collection of back-end servers. If a node in a load-balancing cluster becomes inoperative, the load-balancing software detects the failure and redirects requests to other cluster nodes. Red Hat Cluster Suite provides load balancing through LVS (Linux Virtual Server).
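
The behaviour described above (distribute requests, detect a failed node, send traffic elsewhere) can be illustrated with a small sketch. This is not how LVS is implemented; it is a toy round-robin front end, and the class name RoundRobinBalancer, its methods, and the node names are all hypothetical. Failure detection is assumed to happen elsewhere and is reported through mark_down().

```python
class RoundRobinBalancer:
    """Toy front end that spreads requests over healthy back-end nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)        # fixed rotation order
        self.healthy = set(self.nodes)  # nodes currently eligible for traffic
        self._next = 0                  # index of the next node to try

    def mark_down(self, node):
        """Record that a health check found this node inoperative."""
        self.healthy.discard(node)

    def mark_up(self, node):
        """Return a known node to the rotation once it has recovered."""
        if node in self.nodes:
            self.healthy.add(node)

    def route(self):
        """Return the next healthy node in round-robin order."""
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next]
            self._next = (self._next + 1) % len(self.nodes)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy back-end nodes available")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["web1", "web2", "web3"])
    print([lb.route() for _ in range(3)])
    lb.mark_down("web2")                # simulated node failure
    print([lb.route() for _ in range(4)])
```

Round robin is only one possible distribution policy; a real load balancer may also weigh nodes by capacity or by current connection count.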

Storage Clusters

Storage clusters provide a consistent file system image across servers in a cluster, allowing the servers to simultaneously read and write to a single shared file system. With a cluster-wide file system, a storage cluster eliminates the need for redundant copies of application data and simplifies backup and disaster recovery. Red Hat Cluster Suite provides storage clustering through Red Hat GFS (Global File System).

Why do we need HA clusters?

24×7 Mission Critical Services have the following requirements:

Scalability: When workload increases, the system must scale up to meet the requirements.
Availability: The service must always be on and available, despite hardware and software failures.
Cost-effectiveness: The whole system must be economical to build and expand.
Manageability: Although the whole system may be big in physical size, it should be easy to manage.

Understanding HA Clustering

HA Clustering often uses the following terms:

Active/Active Clustering
Active/Passive Clustering
Failover Clustering
Failsafe Clustering

The terms Active/Active and Active/Passive Clustering mean different things to different people. It is clearer to describe an HA cluster using the terms Failover and Failsafe.

Failover Clusters:

Multiple members (nodes) can be a part of the cluster.
One or more services are active on a given member at any given time.
Upon failure, the services and associated resources fail over to other member(s) in the cluster.
During this failover, depending on the application and service structure, end users might experience a session break.
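
As a rough illustration of the failover behaviour just described, the sketch below tracks which member hosts each service and relocates the services of a failed member. It is a simulation only, not Red Hat Cluster Suite code: the class FailoverCluster, its methods, and the "least loaded survivor" placement rule are assumptions made for the example.

```python
class FailoverCluster:
    """Toy model: each service runs on exactly one member at a time."""

    def __init__(self, members):
        self.online = list(members)  # members currently able to host services
        self.placement = {}          # service name -> hosting member

    def start_service(self, service, member):
        if member not in self.online:
            raise ValueError(f"{member} is not an online cluster member")
        self.placement[service] = member

    def _load(self, member):
        """Number of services currently hosted on a member."""
        return sum(1 for m in self.placement.values() if m == member)

    def fail_member(self, failed):
        """Take a member offline and fail its services over to survivors.

        Returns the list of services that were relocated.
        """
        self.online.remove(failed)
        if not self.online:
            raise RuntimeError("no surviving members to fail over to")
        moved = []
        for service, member in list(self.placement.items()):
            if member == failed:
                # Assumed policy: pick the least loaded surviving member.
                self.placement[service] = min(self.online, key=self._load)
                moved.append(service)
        return moved


if __name__ == "__main__":
    cluster = FailoverCluster(["node1", "node2", "node3"])
    cluster.start_service("web", "node1")
    cluster.start_service("db", "node1")
    cluster.start_service("mail", "node2")
    print(cluster.fail_member("node1"))
    print(cluster.placement)
```

The sketch moves only the placement record. In a real cluster the associated resources (IP addresses, storage, application state) must move too, which is why end users may notice a session break during failover.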

Failsafe Clusters:

Multiple members (nodes) can be a part of the cluster.
Services are active on all members.
Upon failure, the cluster infrastructure simply stops sending requests to the failed node and directs these to the active node.

Active/Passive Terminology

Active/Passive terminology generally refers to failover clusters running only one service. Active/Active terminology can be used to refer to failover clusters with multiple services such that all cluster members are hosting at least one service.
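
The two definitions can be written down as a small check. This is only one reading of the terms, which (as noted earlier) are used differently by different people: it assumes Active/Active means a failover cluster with more than one service in which every member hosts at least one. The function name classify_ha_cluster and its arguments are hypothetical.

```python
def classify_ha_cluster(members, placement):
    """Label a failover cluster using the definitions in this section.

    members:   list of cluster member names
    placement: dict mapping each service name to the member hosting it
    """
    hosting = set(placement.values())  # members that host at least one service
    if len(placement) == 1:
        # A single service: one member is active, the others stand by.
        return "active/passive"
    if len(placement) > 1 and members and all(m in hosting for m in members):
        # Several services, and no member is idle.
        return "active/active"
    return "neither"


if __name__ == "__main__":
    print(classify_ha_cluster(["node1", "node2"], {"web": "node1"}))
    print(classify_ha_cluster(["node1", "node2"], {"web": "node1", "db": "node2"}))
```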