HDPCA Exam Objective – Configure HiveServer2 HA ( Part 1 – Installing HiveServer )

Note: This post is part of the HDPCA exam objective series.

In production it is important to configure high availability so that if one HiveServer2 instance fails, the remaining instances can keep responding to client requests. This is achieved through the ZooKeeper discovery mechanism, which points clients to the HiveServer2 instances that are currently active.
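To make the discovery mechanism concrete, here is a minimal sketch of how a client-side JDBC URL for ZooKeeper discovery can be assembled. The ZooKeeper host names and the helper name hs2_zk_url are hypothetical, and the namespace "hiveserver2" is assumed to be the value configured for your cluster; adjust all of them to your environment.

```shell
# Build a HiveServer2 JDBC URL that uses ZooKeeper service discovery.
# $1: comma-separated ZooKeeper quorum, host:port,... (hypothetical hosts)
# $2: ZooKeeper namespace (optional; "hiveserver2" is an assumed default)
hs2_zk_url() {
  quorum="$1"
  namespace="${2:-hiveserver2}"
  printf 'jdbc:hive2://%s/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=%s\n' \
    "$quorum" "$namespace"
}

# Print the URL for an example three-node ZooKeeper ensemble.
hs2_zk_url "zk1.localdomain:2181,zk2.localdomain:2181,zk3.localdomain:2181"
```

The printed URL can be passed to beeline with its -u option. Because the URL names the ZooKeeper ensemble rather than a single HiveServer2 host, the client does not need to know which instance is alive.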

Adding HiveServer2

Before configuring HiveServer2 HA, let's first add the HiveServer2 service using Ambari.

Prerequisites

A database named hive must exist before we start adding the HiveServer2 service. I will create it in MariaDB, but you can also use Oracle, MySQL or PostgreSQL if you prefer.

1. As the MariaDB Database Management System (DBMS) is not installed by default on RHEL/CentOS 7, we will start by installing the required packages.

# yum install -y mariadb mariadb-server

2. Enable the mariadb service so that it starts at boot, then start it:

# systemctl enable mariadb
# systemctl start mariadb

3. Create the database “hive” as the root user.

# mysql -u root -p
MariaDB [(none)]> CREATE DATABASE hive;
Query OK, 1 row affected (0.04 sec)

4. Create a new user “hiveuser” and grant it full privileges on all objects in the hive database.

MariaDB [(none)]> GRANT ALL ON hive.* TO 'hiveuser'@'localhost' IDENTIFIED BY 'password';
Query OK, 0 rows affected (0.01 sec)

5. Connect to the database as the new user to verify the grant:

# mysql -u hiveuser -p

List the available databases:

MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| test               |
+--------------------+
3 rows in set (0.04 sec)
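The same check can be scripted instead of read by eye. The sketch below uses a hypothetical helper, has_database, that scans a list of database names on standard input; the sample list simply mirrors the output above.

```shell
# Succeed if the database named in $1 appears, as a whole line,
# in the list of database names read from standard input.
has_database() {
  grep -qxF -- "$1"
}

# Demonstration with the same names the SHOW DATABASES output lists.
printf 'information_schema\nhive\ntest\n' | has_database hive \
  && echo "hive database is visible"
```

Against a live server, the input could come from mysql -u hiveuser -p -N -e 'SHOW DATABASES;' piped into has_database hive, where -N suppresses the column header and -e runs the statement non-interactively.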

Adding HiveServer service using Ambari

1. To add the HiveServer service, go to the Services page and select “Add Service” under the “Actions” drop-down.

2. This will start the Add Service Wizard. Select the services “HiveServer”, “Tez”, “Pig” and “Slider”. The other three services are dependencies of HiveServer and therefore must be installed. Even if you do not select them, the wizard will add them for you.

3. On the “Assign Masters” page, assign the roles “HiveServer2” and “Hive Metastore” to the cluster nodes. The role “WebHCat Server” is automatically assigned to the cluster node that has the “HiveServer2” role.

4. On the “Assign Slaves and Clients” page, select the cluster nodes on which to install the Tez Client, HCat Client, Hive Client, Pig Client and Slider Client.

5. On the “Customize Services” page, we have to provide the login details of the database we created as part of the prerequisites. Provide the following information:

  • Database Name : hive
  • Database Username : hiveuser
  • Database Password : password
  • Database URL : jdbc:mysql://ambari-server.localdomain/hive (I created the MariaDB database on the Ambari server)

We also have to download the MySQL Connector/JDBC driver from the MySQL site and install it on the Ambari server, which hosts our MariaDB database.

https://dev.mysql.com/downloads/connector/j/

Select the package that matches your OS, download it and install it on the Ambari server.

# yum install mysql-connector-java-8.0.12-1.el7.noarch.rpm

Once the MySQL connector is installed, run the command below on the Ambari server to register the driver:

# ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java-8.0.12.jar
Using python  /usr/bin/python
Setup ambari-server
Copying /usr/share/java/mysql-connector-java-8.0.12.jar to /var/lib/ambari-server/resources
If you are updating existing jdbc driver jar for mysql with mysql-connector-java-8.0.12.jar. Please remove the old driver jar, from all hosts. Restarting services that need the driver, will automatically copy the new jar to the hosts.
JDBC driver was successfully initialized.
Ambari Server 'setup' completed successfully.
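Because the jar file name changes with the connector version, it can help to locate the jar before registering it. The following is a minimal sketch under that assumption; the helper names find_connector_jar and register_driver are hypothetical, and /usr/share/java is taken from the path used above.

```shell
# Print the path of the first MySQL Connector/J jar found in directory $1
# (default: /usr/share/java). Return non-zero if no jar is present.
find_connector_jar() {
  dir="${1:-/usr/share/java}"
  for jar in "$dir"/mysql-connector-java*.jar; do
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  return 1
}

# Register the driver with Ambari only if a jar was actually found.
# Defined here but not called; run it as root on the Ambari server.
register_driver() {
  jar="$(find_connector_jar "$1")" || {
    echo "no MySQL connector jar found" >&2
    return 1
  }
  ambari-server setup --jdbc-db=mysql --jdbc-driver="$jar"
}
```

Calling register_driver with no argument looks in /usr/share/java and then runs the same ambari-server setup command shown above with whatever jar it found.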

We can now test the connection to the MariaDB database from the wizard, and it should succeed.

6. On the next page, the Ambari wizard will recommend some property changes for the HiveServer install. I will accept the recommended changes and proceed.

7. Review the configuration before you begin the installation.

8. The installation will begin now. The wizard will install all the required components, start the services and even test the started services as a part of the installation.

9. Review the summary page and complete the installation. You will have to restart a few services, such as HDFS, MapReduce2 and YARN, to complete the installation.

Finally, verify that all the components of the service are up and running.
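Besides checking the Ambari web UI, the service state can also be read from the Ambari REST API. This is a rough sketch: the base URL, the cluster name "mycluster" and both helper names are assumptions, and the service is assumed to be registered in Ambari under the name HIVE.

```shell
# Build the Ambari REST URL that reports the state of the HIVE service.
# $1: Ambari base URL, $2: cluster name (both hypothetical placeholders).
hive_state_url() {
  printf '%s/api/v1/clusters/%s/services/HIVE?fields=ServiceInfo/state\n' "$1" "$2"
}

# Query Ambari as user $3; curl prompts for that user's password.
# Defined here but not called, since it needs a reachable Ambari server.
check_hive_state() {
  curl -s -u "$3" "$(hive_state_url "$1" "$2")"
}

# Print the URL that would be queried.
hive_state_url "http://ambari-server.localdomain:8080" "mycluster"
```

When the service is running, the JSON returned by check_hive_state should report the state as STARTED.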
