The Geek Diary

HDPCA Exam Objective – Recover a snapshot

By admin

Note: This post is part of the HDPCA exam objective series.

We mentioned earlier that HDFS replication alone is not a suitable backup strategy. Hadoop 2 added snapshots to the filesystem, which brings another level of data protection to HDFS. As changes are made to the filesystem, any change that would affect a snapshot is treated specially. For example, if a file that exists in a snapshot is deleted, it is removed from the current state of the filesystem, but its metadata remains in the snapshot, and the blocks holding its data stay on the filesystem, accessible only through the snapshot view.

We can recover a snapshot in HDFS to roll back to a desired system state in case of data loss or corruption. As part of this exam objective, we will create a snapshot and then recover a deleted file from it in this post.
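Before starting, it can be useful to see which directories already allow snapshots. A quick check (run as the hdfs user; the output depends on your cluster):

```shell
# List every snapshottable directory visible to the current user.
# Directories become snapshottable via "hdfs dfsadmin -allowSnapshot".
hdfs lsSnapshottableDir
```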

1. Create a snapshot

1. Let’s first create a snapshot on a snapshottable directory. If the directory is not snapshottable, you can enable snapshots on it using the command:

$ hdfs dfsadmin -allowSnapshot /user/test
Allowing snapshot on test succeeded

2. Create a snapshot of the directory “/user/test” with snapshot_latest as the name of the snapshot.

$ hdfs dfs -createSnapshot /user/test snapshot_latest
Created snapshot /user/test/.snapshot/snapshot_latest

3. View the snapshot in the .snapshot directory.

$ hdfs dfs -ls /user/test/.snapshot
Found 1 items
drwxr-xr-x   - hdfs hdfs          0 2018-07-21 10:16 /user/test/.snapshot/snapshot_latest
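Snapshots can also be renamed or deleted later. A couple of related commands, sketched against the same /user/test directory (the new name snapshot_20180721 is only an example):

```shell
# Rename an existing snapshot: <dir> <oldName> <newName>.
hdfs dfs -renameSnapshot /user/test snapshot_latest snapshot_20180721

# Delete a snapshot you no longer need; its blocks become
# reclaimable once no other snapshot or live file references them.
hdfs dfs -deleteSnapshot /user/test snapshot_20180721
```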

2. Delete a file

Now, delete any file from the /user/test directory in HDFS.

$ hdfs dfs -ls /user/test
Found 2 items
-rw-r--r--   3 hdfs hdfs         27 2018-07-21 10:34 /user/test/another_test
-rw-r--r--   3 hdfs hdfs         21 2018-07-21 10:10 /user/test/test_file
$ hdfs dfs -rm /user/test/test_file
18/07/21 11:06:40 INFO fs.TrashPolicyDefault: Moved: 'hdfs://geeklab/user/test/test_file' to trash at: hdfs://geeklab/user/hdfs/.Trash/Current/user/test/test_file
Note the mention of trash directories: by default, HDFS moves any deleted file into a .Trash directory under the deleting user’s home directory, which helps defend against slips of the finger. Trashed files can be removed immediately with “hdfs dfs -expunge”, or are purged automatically after the trash interval configured by fs.trash.interval (often set to several hours or days, depending on the distribution).
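If you need to bypass the trash, for example to free space immediately, “-rm” accepts a “-skipTrash” flag. Use it with care: after a skip-trash delete, a snapshot may hold the only remaining copy of the data.

```shell
# Permanently delete the file, bypassing the .Trash directory.
# Recovery is then only possible from a snapshot, if one exists.
hdfs dfs -rm -skipTrash /user/test/test_file
```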

Verify that the file is no longer present.

$ hdfs dfs -ls /user/test/test_file
ls: `/user/test/test_file': No such file or directory

3. Recover the snapshot

1. You can restore the deleted file from the /user/test/.snapshot directory, which still holds a read-only copy of test_file.

$ hdfs dfs -ls /user/test/.snapshot/snapshot_latest
Found 1 items
-rw-r--r--   3 hdfs hdfs         21 2018-07-21 10:10 /user/test/.snapshot/snapshot_latest/test_file
$ hdfs dfs -cat /user/test/.snapshot/snapshot_latest/test_file
This is a test file.

2. Let’s copy the removed file from the snapshot directory back to its original location.

$ hdfs dfs -cp /user/test/.snapshot/snapshot_latest/test_file /user/test/

Verify:

$ hdfs dfs -ls /user/test/test_file
-rw-r--r--   3 hdfs hdfs         21 2018-07-21 11:22 /user/test/test_file
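When you are not sure what changed since a snapshot was taken, “hdfs snapshotDiff” reports the differences between two snapshots of a directory, or between a snapshot and the current state (denoted by “.”). Against our lab directory this would look like:

```shell
# Compare snapshot_latest against the current state of /user/test.
# Output lines are prefixed with + (created), - (deleted),
# M (modified), or R (renamed).
hdfs snapshotDiff /user/test snapshot_latest .
```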

Filed Under: Hadoop, HDPCA, Hortonworks HDP

