HDPCA Exam Objective – Create a home directory for a user and configure permissions

Note: This post is part of the HDPCA exam objective series

HDFS

HDFS (Hadoop Distributed File System) is the storage layer of the Hadoop cluster. It is a distributed filesystem, and it is very important for a Hadoop admin to know how to configure and manage HDFS inside out. For the purpose of the exam, we will look at a few of the basic commands to administer HDFS. This includes creating directories, managing ownership and permissions, loading data into HDFS, and copying data from HDFS to the local filesystem.

Navigating HDFS

There are several ways to navigate through HDFS. I have listed the most commonly used ones below.

1. Using the NameNode Web UI

You can view the files on HDFS using the NameNode web UI, which is located at http://[namenode]:50070.

You can go to Utilities > Browse the file system and browse through all the directories.

You cannot upload anything to the HDFS filesystem from the NameNode UI, but you can download files to your local system using it.

2. Using Ambari

The Swiss army knife of the Hadoop ecosystem, Ambari, can be used to view files, download them, upload new files, create new directories, and more. Use the Ambari “Files View” to browse the HDFS filesystem.

3. Using the command line

HDFS provides a filesystem shell, invoked with the command “hdfs dfs”, for manipulating the data stored in it. A few examples of using the command are demonstrated below.

1. To list the contents of a directory:

[hdfs@dn1 ~]$ hdfs dfs -ls /user
Found 2 items
drwxrwx---   - ambari-qa hdfs          0 2018-07-15 12:44 /user/ambari-qa
drwxr-xr-x   - hbase     hdfs          0 2018-07-16 09:26 /user/hbase

2. Creating a new directory:

[hdfs@dn1 ~]$ hdfs dfs -mkdir /test
[hdfs@dn1 ~]$ hdfs dfs -ls /
Found 9 items
drwxrwxrwx   - yarn   hadoop          0 2018-07-15 12:43 /app-logs
drwxr-xr-x   - hdfs   hdfs            0 2018-07-16 09:26 /apps
drwxr-xr-x   - yarn   hadoop          0 2018-07-15 12:40 /ats
drwxr-xr-x   - hdfs   hdfs            0 2018-07-15 12:40 /hdp
drwxr-xr-x   - mapred hdfs            0 2018-07-15 12:40 /mapred
drwxrwxrwx   - mapred hadoop          0 2018-07-15 12:41 /mr-history
drwxr-xr-x   - hdfs   hdfs            0 2018-07-16 20:23 /test

3. Copy files within HDFS:

$ hdfs dfs -cp file1 file2

Note that relative paths like “file1” are resolved against the running user’s HDFS home directory, i.e. /user/<username>.

Creating a user and configuring a home directory for the user in HDFS

The sections above give a brief overview of the options available for working with HDFS as a filesystem. In the exam, you will only be asked to create a user, configure the user’s home directory in HDFS, and move some data to and from HDFS.

1. Create a local user

First, create a local user, which by default will have a home directory on the local filesystem. Log in as root on any of the datanodes and use the Linux command “useradd” to create a new user named test.

# useradd test

Verify the user creation and also view the home directory of the user.

# id test
uid=1009(test) gid=1009(test) groups=1009(test)
# su - test
$ pwd
/home/test

2. Create home directory for user into HDFS

Create a home directory for the user in HDFS using the “hdfs dfs” command. You have to run this as the “hdfs” user (the HDFS superuser).

$ hdfs dfs -mkdir /user/test
$ hdfs dfs -ls /user
Found 3 items
drwxrwx---   - ambari-qa hdfs          0 2018-07-15 12:44 /user/ambari-qa
drwxr-xr-x   - hbase     hdfs          0 2018-07-16 09:26 /user/hbase
drwxr-xr-x   - hdfs      hdfs          0 2018-07-16 20:38 /user/test

3. Change the ownership of the home directory in HDFS

Directories created by the “hdfs” user are owned by hdfs:hdfs by default. For the new “test” user to be able to use this directory, we first need to change its ownership to test:test.

$ hdfs dfs -chown test:test /user/test

Verify the new permissions again:

$ hdfs dfs -ls /user
Found 3 items
drwxrwx---   - ambari-qa hdfs          0 2018-07-15 12:44 /user/ambari-qa
drwxr-xr-x   - hbase     hdfs          0 2018-07-16 09:26 /user/hbase
drwxr-xr-x   - test      test          0 2018-07-16 20:38 /user/test

You can also use separate commands to change the user ownership and the group ownership, as shown below:

$ hdfs dfs -chown test /user/test
$ hdfs dfs -chgrp test /user/test

4. Change the permissions of the home directory

Similar to changing ownership, you can change the permissions of the home directory, or of any file in HDFS for that matter. To view the current permissions:

$ hdfs dfs -ls /user
Found 3 items
drwxrwx---   - ambari-qa hdfs          0 2018-07-15 12:44 /user/ambari-qa
drwxr-xr-x   - hbase     hdfs          0 2018-07-16 09:26 /user/hbase
drwxr-xr-x   - test      test          0 2018-07-16 20:38 /user/test

Let’s try changing the permissions of the directory “/user/test” to “660”.

$ hdfs dfs -chmod 660 /user/test

Verify the new permissions again:

$ hdfs dfs -ls /user
Found 3 items
drwxrwx---   - ambari-qa hdfs          0 2018-07-15 12:44 /user/ambari-qa
drwxr-xr-x   - hbase     hdfs          0 2018-07-16 09:26 /user/hbase
drw-rw----   - test      test          0 2018-07-16 20:38 /user/test
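HDFS uses the same POSIX-style octal permission notation as Linux, so the effect of a mode like 660 can be illustrated on a local directory. One caveat worth knowing: 660 drops the execute bit, and a directory without execute permission cannot be traversed, so a real home directory is normally left at a 7xx mode such as 700 or 755. A minimal local sketch using standard coreutils (the scratch path is generated, not part of the exam setup):

```shell
# Octal modes mean the same thing in HDFS as on the local filesystem:
# 660 = rw- for the owner, rw- for the group, --- for others.
dir=$(mktemp -d)           # scratch directory for the demo
chmod 660 "$dir"
stat -c '%a %A' "$dir"     # prints: 660 drw-rw----
chmod 755 "$dir"           # restore a traversable mode
rmdir "$dir"
```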

Copying files to and from HDFS filesystem

To verify that we have performed all the steps properly, we can create a file locally and upload it to HDFS, and vice versa. Let’s first create a file locally as the hdfs user.

$ touch /home/hdfs/test_file

Now, copy this file to the HDFS home directory of the user test. If you want to copy a file from any other location, make sure its parent directory has permissions of at least 755 so the file is readable.

$ hdfs dfs -put /home/hdfs/test_file /user/test/

Verify:

$ hdfs dfs -ls /user/test
Found 1 items
-rw-r--r--   2 hdfs test          0 2018-07-16 22:31 /user/test/test_file

Note: The commands “hdfs dfs -put” and “hdfs dfs -copyFromLocal” are equivalent and can be used interchangeably.

Let’s now copy the same file from HDFS to local filesystem.

$ hdfs dfs -get /user/test/test_file /tmp/

Verify:

$ ls -l /tmp/test_file
-rw-r--r--. 1 hdfs hadoop 0 Jul 16 22:38 /tmp/test_file

The commands “hdfs dfs -get” and “hdfs dfs -copyToLocal” are equivalent and can be used interchangeably.
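To tie the steps together, the whole exam task can be condensed into one short script. This is only a sketch under assumptions: it must be run as root on a cluster node where the hdfs client is on the PATH, the username “test” is just the example used throughout this post, and “sudo -u hdfs” is one common way to run commands as the hdfs superuser.

```shell
#!/bin/sh
# Sketch of the full exam objective: create a local user, give it an HDFS
# home directory with the right ownership, and round-trip a file.
# Assumption: run as root on a node with the hdfs client installed.
set -e

if ! command -v hdfs >/dev/null 2>&1; then
    echo "hdfs client not found; run this on a cluster node"
    exit 0
fi

USERNAME=test                                              # example user

id -u "$USERNAME" >/dev/null 2>&1 || useradd "$USERNAME"   # local user
sudo -u hdfs hdfs dfs -mkdir -p "/user/$USERNAME"          # HDFS home dir
sudo -u hdfs hdfs dfs -chown "$USERNAME:$USERNAME" "/user/$USERNAME"

# Round trip as the new user: local -> HDFS -> local
sudo -u "$USERNAME" sh -c "
    touch /tmp/round_trip_file &&
    hdfs dfs -put /tmp/round_trip_file /user/$USERNAME/ &&
    hdfs dfs -get /user/$USERNAME/round_trip_file /tmp/round_trip_copy
"
echo "all steps completed"
```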
