HDPCA Exam Objective – Configure HiveServer2 HA ( Part 2 – Configure HA )
Note: This post is part of the HDPCA exam objective series. Hive first shipped with HiveServer1, but that server was not very stable: it sometimes silently suspended or blocked client connections. Since version 0.11, Hive has included a new server, HiveServer2, alongside HiveServer1. HiveServer2 is an enhanced Hive server designed for multiclient concurrency and improved authentication. HiveServer2 also supports Beeline as the alternative command-line …
Archives for July 2018
How to configure Capacity Scheduler Queues Using YARN Queue Manager
Note: This post is part of the HDPCA exam objective series. The Capacity Scheduler is designed mainly for multitenancy, where multiple organizations collectively fund a cluster according to their computing needs. An added benefit is that an organization can use any excess capacity that others are not using, which gives the organizations elasticity in a cost-effective manner. In the previous post, we installed and configured the Capacity Scheduler. The fundamental unit of …
How to Create HDFS policies in Ranger
Note: This post is part of the HDPCA exam objective series. Apache Ranger is an application that enables data architects to implement security policies on a big data ecosystem. The goal of the project is to provide a unified way for all Hadoop applications to adhere to the defined security guidelines. Here are some of the features of Apache Ranger: centralized administration, fine-grained authorization, standardized authorization, multiple authorization methods, centralized …
How to Configure Hive Authorization Using Apache Ranger
Note: This post is part of the HDPCA exam objective series. Apache Ranger is a framework for enabling, monitoring, and managing comprehensive data security across the Hadoop platform. Ranger helps a Hadoop admin with various security management tasks by providing a mechanism to manage security for various components from a single pane. With Ranger, you can control fine-grained access to various components of the Hadoop ecosystem. Ranger has an Administration Portal you can …
HDPCA Exam Objective – Configure HiveServer2 HA ( Part 1 – Installing HiveServer )
Note: This post is part of the HDPCA exam objective series. It is important to configure high availability in production so that if one HiveServer2 instance fails, the others can respond to client requests. This can be achieved by using the ZooKeeper discovery mechanism to point clients to the active Hive servers. Adding HiveServer2: Before configuring HiveServer2 HA, let's first add the HiveServer2 service using Ambari. Prerequisites: We need to have a database named hive …
HDPCA Exam Objective – View an application’s log file (Troubleshoot a failed job)
Note: This post is part of the HDPCA exam objective series. Troubleshooting running or failed jobs is an integral part of Hadoop administration. To troubleshoot a running or failed job, we must view the application’s log file. This post focuses on the HDPCA exam objective "View an application’s log file". We will run a sample MapReduce program and view its status using the command line and the ResourceManager UI. Running an Example Job: The HDP installation comes …
HDPCA Exam Objective – Configure and manage alerts
Note: This post is part of the HDPCA exam objective series. Monitoring the health of a Hadoop cluster is an important aspect of Hadoop administration. Ambari provides centralized management of health alerts and checks for the services in your cluster. You can set thresholds and enable or disable alerts using the Ambari UI. You can view all the alert definitions from the alerts page in the Ambari dashboard. Alerts that have breached their threshold are shown in red …
HDPCA Exam Objective – Install and configure Knox
Note: This post is part of the HDPCA exam objective series. Knox Basics: Knox Gateway is another Apache project, one that addresses the concern of secured access to the Hadoop cluster from corporate networks. Knox Gateway provides a single point of authentication and access for Apache Hadoop services in a cluster. Knox runs as a cluster of servers in the DMZ, isolating the Hadoop cluster within the corporate network. The key feature of Knox Gateway is that it provides perimeter …
HDPCA Exam Objective – Configure the Capacity Scheduler
Note: This post is part of the HDPCA exam objective series. YARN Schedulers: The Hadoop YARN scheduler is responsible for assigning resources to the applications submitted by users. There are three types of schedulers in YARN: First In, First Out (FIFO) (Hadoop 1.x), Fair Scheduler, and Capacity Scheduler. First In, First Out (FIFO): By default, YARN supports a First In, First Out (FIFO) scheduler, which uses a job queue to execute jobs in the order they arrive. However, FIFO …
HDPCA Exam Objective – Configure HDFS ACLs
Note: This post is part of the HDPCA exam objective series. Starting with Hadoop 2.4, HDFS can be configured with ACLs. These ACLs work much the same way as extended ACLs in a Unix environment, allowing files and directories in HDFS to carry finer-grained permissions than the basic POSIX permissions. To check whether the value has already been set, go to Services > HDFS > Configs and search for the property "dfs.namenode.acls.enabled" in the search box. Enabling HDFS ACLs: To use HDFS …