Diffstat (limited to 'system/bees/README')
-rw-r--r-- | system/bees/README | 26 |
1 files changed, 15 insertions, 11 deletions
diff --git a/system/bees/README b/system/bees/README
index 88041ffa13..76d510e184 100644
--- a/system/bees/README
+++ b/system/bees/README
@@ -1,27 +1,31 @@
 bees (Best-Effort Extent-Same) is a block-oriented userspace
-deduplication agent designed for large btrfs filesystems.  It is an
-offline dedupe combined with an incremental data scan capability to
-minimize time data spends on disk from write to dedupe.
+deduplication agent designed to scale up to large btrfs filesystems.
+It is an offline dedupe combined with an incremental data scan
+capability to minimize time data spends on disk from write to dedupe.
 
 Strengths:
 
- * Space-efficient hash table and matching algorithms - can use as
-   little as 1 GB hash table per 10 TB unique data (0.1GB/TB)
- * Daemon incrementally dedupes new data using btrfs tree search
+ * Space-efficient hash table - can use as little as 1 GB hash table
+   per 10 TB unique data (0.1GB/TB)
+ * Daemon mode - incrementally dedupes new data as it appears
+ * Largest extents first - recover more free space during fixed
+   maintenance windows
  * Works with btrfs compression - dedupe any combination of compressed
    and uncompressed files
- * Works around btrfs filesystem structure to free more disk space
+ * Whole-filesystem dedupe - scans data only once, even with snapshots
+   and reflinks
  * Persistent hash table for rapid restart after shutdown
- * Whole-filesystem dedupe - including snapshots
  * Constant hash table size - no increased RAM usage if data set
    becomes larger
  * Works on live data - no scheduled downtime required
- * Automatic self-throttling based on system load
+ * Automatic self-throttling - reduces system load
+ * btrfs support - recovers more free space from btrfs than naive
+   dedupers
 
 Weaknesses:
 
  * Whole-filesystem dedupe - has no include/exclude filters, does not
    accept file lists
- * Requires root privilege (or CAP_SYS_ADMIN)
- * First run may require temporary disk space for extent reorganization
+ * Requires root privilege (`CAP_SYS_ADMIN` plus the usual filesystem
+   read/modify caps)
+ * First run may increase metadata space usage if many snapshots exist
  * Constant hash table size - no decreased RAM usage if data set
    becomes smaller
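
The sizing figure in the README (as little as 1 GB of hash table per 10 TB of unique data, i.e. 0.1GB/TB) can be turned into a quick back-of-the-envelope calculation. The sketch below is only an illustration of that arithmetic: the function name `min_hash_table_gb` and the constant `MIN_GB_PER_TB` are hypothetical and not part of bees, and the result is the low end of the range quoted in the README, not a sizing recommendation.

```python
# Ratio quoted in the README: 1 GB of hash table per 10 TB of unique data.
# MIN_GB_PER_TB and min_hash_table_gb are illustrative names, not bees APIs.
MIN_GB_PER_TB = 0.1


def min_hash_table_gb(unique_data_tb: float) -> float:
    """Return the low-end hash table size in GB for unique_data_tb of data."""
    if unique_data_tb < 0:
        raise ValueError("unique_data_tb must be non-negative")
    return unique_data_tb * MIN_GB_PER_TB


if __name__ == "__main__":
    for tb in (1, 10, 50):
        size_gb = min_hash_table_gb(tb)
        print(f"{tb} TB unique data: as little as {size_gb:.1f} GB hash table")
```

Because the README describes the hash table size as constant, a figure like this would be chosen once, up front; it does not grow or shrink as the data set changes.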