Diffstat (limited to 'system/bees/README')
 system/bees/README | 26 ++++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/system/bees/README b/system/bees/README
index 88041ffa13..76d510e184 100644
--- a/system/bees/README
+++ b/system/bees/README
@@ -1,27 +1,31 @@
bees (Best-Effort Extent-Same) is a block-oriented userspace
-deduplication agent designed for large btrfs filesystems. It is an
-offline dedupe combined with an incremental data scan capability to
-minimize time data spends on disk from write to dedupe.
+deduplication agent designed to scale up to large btrfs filesystems.
+It is an offline dedupe combined with an incremental data scan
+capability to minimize time data spends on disk from write to dedupe.
Strengths:
- * Space-efficient hash table and matching algorithms - can use as
- little as 1 GB hash table per 10 TB unique data (0.1GB/TB)
- * Daemon incrementally dedupes new data using btrfs tree search
+ * Space-efficient hash table - can use as little as 1 GB hash table
+ per 10 TB unique data (0.1GB/TB)
+ * Daemon mode - incrementally dedupes new data as it appears
+ * Largest extents first - recover more free space during fixed
+ maintenance windows
* Works with btrfs compression - dedupe any combination of compressed
and uncompressed files
- * Works around btrfs filesystem structure to free more disk space
+ * Whole-filesystem dedupe - scans data only once, even with snapshots
+ and reflinks
* Persistent hash table for rapid restart after shutdown
- * Whole-filesystem dedupe - including snapshots
* Constant hash table size - no increased RAM usage if data set
becomes larger
* Works on live data - no scheduled downtime required
- * Automatic self-throttling based on system load
+ * Automatic self-throttling - reduces system load
+ * btrfs support - recovers more free space from btrfs than naive
+ dedupers
Weaknesses:
* Whole-filesystem dedupe - has no include/exclude filters, does not
accept file lists
- * Requires root privilege (or CAP_SYS_ADMIN)
- * First run may require temporary disk space for extent reorganization
+ * Requires root privilege (`CAP_SYS_ADMIN` plus the usual filesystem
+ read/modify caps)
* First run may increase metadata space usage if many snapshots exist
* Constant hash table size - no decreased RAM usage if data set
becomes smaller
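
The hash-table sizing rule quoted above (roughly 1 GB of hash table per 10 TB of unique data, i.e. 0.1 GB/TB) can be sketched as a quick calculation. This is an illustrative helper, not part of bees; the function name and the 16 MiB rounding unit are assumptions made here for the sketch:

```python
def recommended_hash_table_bytes(unique_data_bytes: int) -> int:
    """Apply the README's rule of thumb: ~1 GB of hash table
    per 10 TB of unique data (0.1 GB/TB)."""
    GB = 1024 ** 3
    TB = 1024 ** 4
    # Ratio from the README: 1 GB per 10 TB of unique data.
    size = unique_data_bytes * GB // (10 * TB)
    # Round up to a whole multiple of 16 MiB as an allocation
    # unit for the fixed-size hash table file (the unit chosen
    # here is an assumption, not taken from this README).
    unit = 16 * 1024 ** 2
    return max(unit, (size + unit - 1) // unit * unit)

# 10 TB of unique data -> a 1 GiB hash table
print(recommended_hash_table_bytes(10 * 1024 ** 4) // 1024 ** 2, "MiB")
```

Because the hash table is a fixed size (a strength and a weakness in the lists above), this number is chosen once up front: RAM usage does not grow if the data set grows, and does not shrink if it shrinks.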