duperemove: command not found

The duperemove utility finds duplicate filesystem extents and optionally schedules them for deduplication. An “extent” is a contiguous region of a file's data as stored by the filesystem. On some filesystems, a single extent can be referenced by more than one file when their contents are identical. This is known as “extent sharing” or “deduplication”, and it saves disk space by removing the need to store multiple copies of the same data.

The duperemove command hashes the contents of the files it is given and identifies extents that are identical. By default it only reports the duplicates it finds. With the -d option it also submits them for deduplication, so that the duplicate extents share a single on-disk copy and the redundant space is freed.
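The idea of finding identical data by comparing content hashes can be illustrated with a much coarser sketch. The function below is an analogy only: it compares whole files rather than extents, uses sha256sum rather than duperemove's own hashing, and never deduplicates anything. The function name find_duplicate_files is made up for this example.

```shell
# Print "checksum  path" lines for every file under a directory whose
# contents match at least one other file. Whole-file analogy only; this is
# not how duperemove itself is implemented.
# Usage: find_duplicate_files path/to/directory
find_duplicate_files() {
    find "$1" -type f -exec sha256sum {} + |
        sort |
        awk '
            # Field 1 is the checksum; sorting puts equal checksums on adjacent lines.
            $1 == prev_hash {
                if (!group_open) print prev_line   # first member of the group
                print
                group_open = 1
            }
            $1 != prev_hash { group_open = 0 }
            { prev_hash = $1; prev_line = $0 }
        '
}
```

Files that appear in the output with the same checksum have identical contents, which makes them likely candidates for a duperemove run.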

If you see the following error when running duperemove:

duperemove: command not found

install the duperemove package using the command for your distribution:

Distribution    Command
Debian          apt-get install duperemove
Ubuntu          apt-get install duperemove
Kali Linux      apt-get install duperemove
Fedora          dnf install duperemove
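The table above can be turned into a small check, sketched here under a few assumptions: the helper name duperemove_install_cmd is made up for this example, the distribution is identified by the ID field of /etc/os-release, and only the four distributions listed above are covered. The script prints the suggested command rather than running it.

```shell
# Print the install command for a distribution ID (the ID= value in
# /etc/os-release). Covers only the distributions listed in the table.
duperemove_install_cmd() {
    case "$1" in
        debian|ubuntu|kali) echo "apt-get install duperemove" ;;
        fedora)             echo "dnf install duperemove" ;;
        *)                  echo "unsupported distribution: $1" >&2; return 1 ;;
    esac
}

# Only suggest an install command when duperemove is not already on PATH.
if ! command -v duperemove >/dev/null 2>&1; then
    # Read ID in a subshell so the other os-release variables do not leak.
    distro_id=$( [ -r /etc/os-release ] && . /etc/os-release; echo "${ID:-unknown}" )
    duperemove_install_cmd "$distro_id" ||
        echo "install duperemove with your distribution's package manager"
fi
```

Run the printed command as root, or prefix it with sudo.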

duperemove Command Examples

1. Search for duplicate extents in a directory and show them:

# duperemove -r path/to/directory

2. Deduplicate duplicate extents on a Btrfs or XFS (experimental) filesystem:

# duperemove -r -d path/to/directory

3. Use a hash file to store extent hashes (this lowers memory usage, and the file can be reused on subsequent runs):

# duperemove -r -d --hashfile=path/to/hashfile path/to/directory

4. Limit the number of I/O threads (used in the hashing and dedupe stages) and CPU threads (used in the duplicate extent finding stage):

# duperemove -r -d --hashfile=path/to/hashfile --io-threads=N --cpu-threads=N path/to/directory
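For repeated runs it can help to assemble the last command from variables. The sketch below only builds and prints the command line, using the options shown in the examples above. The helper name duperemove_cmdline, the paths /srv/data and /var/tmp/data.hash, and the choice of half the available CPUs for N are all assumptions made for illustration.

```shell
# Build (but do not run) a duperemove command line.
# Arguments: target directory, hashfile path, thread count.
duperemove_cmdline() {
    printf 'duperemove -r -d --hashfile=%s --io-threads=%s --cpu-threads=%s %s\n' \
        "$2" "$3" "$3" "$1"
}

# Assumption: use half of the available CPUs, but never fewer than one.
# Falls back to 2 CPUs if nproc is unavailable.
cpus=$(nproc 2>/dev/null || echo 2)
threads=$(( cpus / 2 ))
[ "$threads" -ge 1 ] || threads=1

# Hypothetical paths; replace them with your own directory and hashfile.
duperemove_cmdline /srv/data /var/tmp/data.hash "$threads"
```

Review the printed command before running it as root. Keeping the same hashfile path between runs is what allows the stored hashes to be reused.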