author Lars Wirzenius <liw@liw.fi> 2015-12-20 17:03:42 +0100
committer Lars Wirzenius <liw@liw.fi> 2015-12-20 17:03:42 +0100
commit 0cecea8e2bcf421de715ccc200504dd2c1df9d53
tree da75515804335c7a4f912fd47e914462ac27b572 /manual/en
parent a0eb69f2e54c8517f4bb79484b03fcc6fec0ca9e
Add explanation of when Obnam de-dup works badly
Diffstat (limited to 'manual/en')
-rw-r--r-- manual/en/060-backing-up.mdwn 23
1 files changed, 23 insertions, 0 deletions
diff --git a/manual/en/060-backing-up.mdwn b/manual/en/060-backing-up.mdwn
index 16ccc5d2..1213918d 100644
--- a/manual/en/060-backing-up.mdwn
+++ b/manual/en/060-backing-up.mdwn
@@ -356,6 +356,29 @@ duplicate data is quite coarse (see the `--chunk-size` setting), and
so Obnam often doesn't find duplication when it exists, when the
changes are small.
+De-duplication isn't useful in the following scenarios:
+
+* A file changes such that things move around within the file. The
+ (current) Obnam de-duplication is based on non-overlapping chunks
+ from the beginning of a file. If some data is inserted, Obnam won't
+ notice that the chunks have shifted around. This can happen, for
+ example, for disk or ISO images.
+
+* Files with duplicate data that is not on a chunk boundary. For
+ example, emails with large attachments. Each email recipient gets
+ different `Received` headers, which shifts the body and attachments
+ by different amounts. As a result, Obnam won't notice the
+ duplication.
+
+* Data in compressed files, such as `.zip` or `.tar.xz` files. Obnam
+ doesn't know about the file compression, and only sees the
+ compressed version of the data. Thus, Obnam won't de-duplicate it.
+
+A future version of Obnam will hopefully improve the de-duplication
+algorithms. If you see this optimistic paragraph in a version of Obnam
+released in 2017 or later, please notify the maintainers. Thank you.
+
+
De-duplication and safety against checksum collisions
-----------------------------------------------------
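
The first two scenarios in the added text (data inserted into a file, and duplicate data that does not fall on a chunk boundary) come down to the same effect: with non-overlapping chunks counted from the start of a file, any shift moves every later chunk boundary. The following is a minimal sketch of that effect, not Obnam's actual code. The helper names, the 1024-byte chunk size, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import os

CHUNK_SIZE = 1024  # illustrative value only, not Obnam's --chunk-size default


def chunk_digests(data, chunk_size=CHUNK_SIZE):
    """Digest each non-overlapping chunk, counting from offset 0."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def shared_chunks(old, new):
    """Count chunks of `new` whose digest already exists among chunks of `old`."""
    known = set(chunk_digests(old))
    return sum(1 for digest in chunk_digests(new) if digest in known)


original = os.urandom(16 * CHUNK_SIZE)   # 16 full chunks of random data
appended = original + b"x" * 10          # new data at the end: boundaries unchanged
inserted = b"x" * 10 + original          # new data at the start: every boundary shifts

print(shared_chunks(original, appended))  # all 16 original chunks are found again
print(shared_chunks(original, inserted))  # no chunk lines up with an old one
```

In this sketch, appending ten bytes leaves all sixteen original chunks reusable, while inserting the same ten bytes at the front leaves none, even though almost all of the data is unchanged.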