How to find all the sparse files in Linux

Sparse files are files whose apparent size is larger than the disk space actually allocated to them. Regions that have never been written ("holes") occupy no blocks on the filesystem and read back as zeros. Sparse files are useful for reducing the time and disk space needed to create loop filesystems or large disk images for virtualized guests, among other things. A sparse file is easy to recognize on a running system because its disk usage is less than its size. We can see this with the /var/log/lastlog file: ls reports its size, while du reports the space it actually occupies.

# ls -lh /var/log/lastlog
-rw-r--r--. 1 root root 286K Dec  3 04:50 /var/log/lastlog
# du -sh /var/log/lastlog
12K     /var/log/lastlog
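
To reproduce the same effect on a scratch file, the sketch below creates a sparse file with truncate and compares its apparent size with the space allocated to it. The file name and the 100 MiB size are arbitrary choices for illustration, and it assumes GNU coreutils (truncate, stat -c) and a filesystem that supports holes; the exact allocated figure varies by filesystem.

```shell
#!/bin/sh
# Scratch location and file name are hypothetical, chosen for this demo.
dir=$(mktemp -d)
file="$dir/sparse.img"

# truncate extends the file to 100 MiB without writing any data,
# so the filesystem has nothing to allocate blocks for.
truncate -s 100M "$file"

# Apparent size in bytes (the figure ls -l reports).
apparent=$(stat -c %s "$file")

# Allocated bytes: block count times the block unit that stat reports.
allocated=$(( $(stat -c %b "$file") * $(stat -c %B "$file") ))

echo "apparent:  $apparent bytes"
echo "allocated: $allocated bytes"

# Clean up the scratch directory.
rm -r "$dir"
```

On a filesystem with sparse file support, the allocated figure should be far smaller than the apparent one, just as du reports less than ls for /var/log/lastlog.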

Finding sparse files

Comparing the output of ls and du identifies a single sparse file, but it becomes cumbersome when you need to find every sparse file in a filesystem or directory, especially when there are many of them. The find command has a -printf format directive that reports each file's sparseness, so all sparse files can be found in one pass. Let’s see an example below.

1. Use the find command with the “%S” format directive to print each file’s sparseness.

# find /var/log -type f -printf "%S\t%p\n"
1       /var/log/tallylog
1.00095 /var/log/audit/audit.log.1
0.0419982       /var/log/lastlog
....

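2. The value displayed in the leftmost column is (BLOCK-SIZE * st_blocks / st_size), which is normally less than 1.0 for a sparse file. Values slightly above 1, such as 1.00095, occur because allocated space is rounded up to whole blocks.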

3. To find all sparse files on the system, keep only the lines whose leftmost column value is less than 1.

# find / -type f -printf "%S\t%p\n" | gawk '$1 < 1.0 {print}'
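
The filter from step 3 can also be wrapped in a small helper and tried on scratch files before running it against /. This is a minimal sketch, not a definitive tool: find_sparse is a hypothetical function name, the test files are made up for the demo, and it assumes GNU find and any POSIX awk. The -xdev option is an addition to the command above that keeps find on the starting filesystem.

```shell
#!/bin/sh
# find_sparse is a hypothetical helper name, not part of any package.
# It prints "ratio<TAB>path" for every regular file under the given
# directory whose sparseness ratio is below 1.
find_sparse() {
    # -xdev keeps find on the starting filesystem, so a search from /
    # does not descend into /proc, /sys or other mounts.
    # -F '\t' splits on the tab only, so paths with spaces stay intact.
    find "$1" -xdev -type f -printf '%S\t%p\n' | awk -F '\t' '$1 < 1.0'
}

# Demo on scratch files: one sparse, one fully written.
dir=$(mktemp -d)
truncate -s 10M "$dir/sparse.img"
head -c 65536 /dev/urandom > "$dir/dense.bin"

result=$(find_sparse "$dir")
echo "$result"

# Clean up the scratch directory.
rm -r "$dir"
```

Only sparse.img should be listed, since dense.bin has data written across its whole length. To scan the root filesystem, call find_sparse / as root.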