From 3490e261887aa4e209164e817a1f72efa7a0a894 Mon Sep 17 00:00:00 2001
From: Erich Ritz
Date: Fri, 1 Apr 2022 11:46:18 +0700
Subject: system/duperemove: Added (Find duplicate extents).

Signed-off-by: Willy Sudiarto Raharjo
---
 system/duperemove/README | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
 create mode 100644 system/duperemove/README

(limited to 'system/duperemove/README')

diff --git a/system/duperemove/README b/system/duperemove/README
new file mode 100644
index 0000000000000..8cb82b560ce4f
--- /dev/null
+++ b/system/duperemove/README
@@ -0,0 +1,20 @@
+Duperemove is a simple tool for finding duplicated extents and
+submitting them for deduplication. When given a list of files it will
+hash their contents on a block by block basis and compare those hashes
+to each other, finding and categorizing blocks that match each other.
+When given the -d option, duperemove will submit those extents for
+deduplication using the Linux kernel extent-same ioctl.
+
+Duperemove can store the hashes it computes in a 'hashfile'. If given an
+existing hashfile, duperemove will only compute hashes for those files
+which have changed since the last run. Thus you can run duperemove
+repeatedly on your data as it changes, without having to re-checksum
+unchanged data.
+
+Duperemove can also take input from the fdupes program.
+
+Deduplication is currently only supported by the btrfs and xfs
+filesystems.
+
+fdupes is an optional runtime dependency (allows the use of the --fdupes
+command line option).
-- 
cgit v1.2.3
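
Note on the README text in this patch: the block-by-block hashing it describes can be illustrated with a minimal Python sketch. This is not duperemove's code. The function names, the 128 KiB block size, and the use of SHA-256 are all assumptions chosen for illustration, and the sketch only reports matching blocks; it never submits anything to the kernel for deduplication.

```python
import hashlib
from collections import defaultdict

# Assumed block size for this sketch only.
BLOCK_SIZE = 128 * 1024


def hash_blocks(path, block_size=BLOCK_SIZE):
    """Yield (offset, digest) for each block of the file at path."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            # SHA-256 is a stand-in hash, picked because it ships with Python.
            yield offset, hashlib.sha256(block).hexdigest()
            offset += len(block)


def find_duplicate_blocks(paths, block_size=BLOCK_SIZE):
    """Map each digest seen more than once to its (path, offset) locations."""
    locations = defaultdict(list)
    for path in paths:
        for offset, digest in hash_blocks(path, block_size):
            locations[digest].append((path, offset))
    # Keep only digests shared by at least two blocks.
    return {d: locs for d, locs in locations.items() if len(locs) > 1}
```

Calling find_duplicate_blocks(["a.img", "b.img"]) (hypothetical file names) returns a dict keyed by digest, where each value lists the (path, offset) pairs whose blocks hashed identically.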