Diffstat (limited to 'faq/checksum-safety.mdwn')
-rw-r--r-- | faq/checksum-safety.mdwn | 35 |
1 files changed, 0 insertions, 35 deletions
diff --git a/faq/checksum-safety.mdwn b/faq/checksum-safety.mdwn
deleted file mode 100644
index cab4802..0000000
--- a/faq/checksum-safety.mdwn
+++ /dev/null
@@ -1,35 +0,0 @@
-[[!meta title="Checksum collisions and safety"]]
-
-Obnam is using the MD5 checksum algorithm for recognising duplicate
-data chunks. MD5 has a reputation for being unsafe: people have
-constructed files that are different, but result in the same MD5
-checksum. This is true.
-
-Every checksum algorithm can have collisions. Changing Obnam to, say,
-SHA1, SHA2, or the as yet unreleased SHA3 would not remove the chance
-of collisions. It would reduce the chance of accidental collisions,
-but the chance of those is already so small with MD5 that it can be
-disregarded. Or put in another way, if you care about the chance of
-accidental MD5 collisions, you should be caring about accidental SHA1,
-SHA2, or SHA3 collisions as well.
-
-Apart from accidental collisions, there are two cases where you should
-worry about checksum collisions (regardless of algorithm).
-
-First, if you're into researching checksum collisions, you're likely
-to have files that cause checksum collisions, and in that case, if you
-restore after a catastrophe, you probably want to get the files back
-intact, rather having Obnam confuse one with the other.
-
-Second, if you have an enemy who wishes to corrupt your backed up
-data, they may replace some of the backed up data with other data that
-has the same checksum. This way, when you restore, your data is
-corrupted without Obnam noticing.
-
-For both of these cases, you can instruct Obnam to **verify** that
-chunks of data with the same checksum actually are the same data,
-instead of relying on the checksum alone. This is as safe as it can
-be, but it has a big performance impact. It causes Obnam to have to
-read from the repository (possibly downloading it from your backup
-server) all the data you are backing up. You'll still benefit from the
-de-duplication, however, so your repository size will be smaller.
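
The verification described in the removed FAQ text can be illustrated with a small sketch. This is not Obnam's code: the `ChunkStore` class, its `verify` and `checksum` parameters, and the in-memory dictionary are all hypothetical, chosen only to show the idea of comparing chunk contents when checksums match. A deliberately weak checksum (the chunk length) stands in for MD5 in the usage example so that a collision is easy to produce.

```python
import hashlib


def md5_hex(data):
    """Default checksum: the MD5 digest of the chunk, as a hex string."""
    return hashlib.md5(data).hexdigest()


class ChunkStore:
    """Toy in-memory de-duplicating chunk store (hypothetical, not Obnam's API)."""

    def __init__(self, verify=False, checksum=md5_hex):
        self.verify = verify
        self.checksum = checksum
        # Maps checksum -> list of distinct chunks sharing that checksum.
        self.chunks = {}

    def put(self, data):
        """Store a chunk and return a (checksum, index) key for retrieving it."""
        digest = self.checksum(data)
        candidates = self.chunks.setdefault(digest, [])
        if not self.verify:
            # Trust the checksum alone: any chunk with the same checksum
            # is assumed to be the same data.
            if not candidates:
                candidates.append(data)
            return digest, 0
        # Verify: read back each stored chunk with this checksum and compare
        # contents. This read is the performance cost of verification.
        for index, existing in enumerate(candidates):
            if existing == data:
                return digest, index
        candidates.append(data)
        return digest, len(candidates) - 1

    def get(self, key):
        """Return the chunk stored under a key returned by put()."""
        digest, index = key
        return self.chunks[digest][index]


if __name__ == "__main__":
    weak = len  # colliding "checksum" used only for demonstration

    trusting = ChunkStore(verify=False, checksum=weak)
    first = trusting.put(b"aaaa")
    second = trusting.put(b"bbbb")
    print(trusting.get(second))  # the first chunk comes back: silent corruption

    careful = ChunkStore(verify=True, checksum=weak)
    first = careful.put(b"aaaa")
    second = careful.put(b"bbbb")
    print(careful.get(second))  # the second chunk is restored intact
```

In the verifying store, identical chunks are still stored only once, so the de-duplication benefit remains. The cost is that every candidate chunk with a matching checksum has to be read back and compared.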