From 82bd3888547f1745b28e50f27345a6a2a6c22ecf Mon Sep 17 00:00:00 2001
From: Lars Wirzenius
Date: Thu, 25 Feb 2021 11:57:10 +0200
Subject: doc: add plan for using encryption

---
 obnam.md | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/obnam.md b/obnam.md
index 240dcf0..a7ba178 100644
--- a/obnam.md
+++ b/obnam.md
@@ -876,6 +876,75 @@ restores all the files in the SQLite database.
 
 
 
+## Encryption and authenticity of chunks
+
+*This is a plan that will be implemented soon. When it has been, this
+section needs to be updated to to use present tense.*
+
+Obnam encrypts data it stores on the server, and checks that the data
+it retrieves from the server is what it stored. This is all done in
+the client: the server should never see any data isn't encrypted, and
+the client can't trust the server to validate anything.
+
+Obnam will be using _Authenticated Encryption with Associated Data_ or
+[AEAD][]. AEAD both encrypts data, and validates it before decrypting.
+AEAD uses two encryption keys, one algorithm for symmetric encryption,
+and one algorithm for a message authentication codes or [MAC][]. AEAD
+encrypts the plaintext with a symmetric encryption algorithm using the
+first key, giving a ciphertext. It then computes a MAC of the
+ciphertext using the second key. Both the ciphertext and MAC are
+stored on the server.
+
+For decryption, the a MAC is computed against the retrieved
+ciphertext, and compared to the retrieved MAC. If the MACs differ,
+that's an error and no decryption is done. If they do match, the
+ciphertext is decrypted.
+
+Obnam will require the user to provide a passphrase, and will derive
+the two keys from the single passphrase, using [PBKDF2][], rather than
+having the user provide two passphrases. The derived keys will be
+stored in file that only the owner can read. (This is simple, and good
+enough for now, but needs to improved later.)
+
+When this is all implemented, there will be a setup step before the
+first backup:
+
+~~~sh
+$ obnam init
+Passphrase for encryption:
+Re-enter to make sure:
+$ obnam backup
+~~~
+
+The `init` step asks for a passphrase, uses PBKDF2 (with the [pbkdf2
+crate][]) to derive the two keys, and writes a JSON file with the keys
+into `~/.config/obnam/keys.json`, making that file be readable only by
+the user running Obnam. Other operations get the keys from that file.
+
+The `init` step will not be optional. There will only be encrypted
+backups.
+
+Obnam will use the [aes-gcm crate][] for AEAD, since it has been
+audited. If that choice turns out to be less than optimal, it can be
+reconsider later.
+
+The chunk sent to the server will be encoded as follows:
+
+* chunk format: a 32-bit unsigned integer, 0x0001
+* length of the MAC: a 32-bit unsigned integer
+* the MAC
+* length of ciphertext: a 32-bit unsigned integer
+* the ciphertext
+
+The format version prefix allows for a modicum of future-proofing.
+
+[AEAD]: https://en.wikipedia.org/wiki/Authenticated_encryption#Authenticated_encryption_with_associated_data_(AEAD)
+[MAC]: https://en.wikipedia.org/wiki/Message_authentication_code
+[aes-gcm crate]: https://crates.io/crates/aes-gcm
+[PBKDF2]: https://en.wikipedia.org/wiki/PBKDF2
+[pbkdf2 crate]: https://crates.io/crates/pbkdf2
+
+
 # Acceptance criteria for the chunk server
 
 These scenarios verify that the chunk server works on its own. The
-- 
cgit v1.2.1


From f60f3d81a17ed3b2ed6c105e5059b49ea70d91a3 Mon Sep 17 00:00:00 2001
From: Lars Wirzenius
Date: Sun, 28 Feb 2021 19:15:18 +0200
Subject: fix: update initial encryption plan, based on feedback

---
 obnam.md | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/obnam.md b/obnam.md
index a7ba178..1940894 100644
--- a/obnam.md
+++ b/obnam.md
@@ -903,7 +903,7 @@ ciphertext is decrypted.
 Obnam will require the user to provide a passphrase, and will derive
 the two keys from the single passphrase, using [PBKDF2][], rather than
 having the user provide two passphrases. The derived keys will be
-stored in file that only the owner can read. (This is simple, and good
+stored in a file that only the owner can read. (This is simple, and good
 enough for now, but needs to improved later.)
 
 When this is all implemented, there will be a setup step before the
@@ -920,29 +920,41 @@ The `init` step asks for a passphrase, uses PBKDF2 (with the [pbkdf2
 crate][]) to derive the two keys, and writes a JSON file with the keys
 into `~/.config/obnam/keys.json`, making that file be readable only by
 the user running Obnam. Other operations get the keys from that file.
+For now, we will use the default parameters of the pbkdf2 crate, to
+keep things simple. (This will need to be made more flexible later: if
+nothing else, Obnam should not be vulnerable to the defaults
+changing.)
 
 The `init` step will not be optional. There will only be encrypted
 backups.
 
 Obnam will use the [aes-gcm crate][] for AEAD, since it has been
 audited. If that choice turns out to be less than optimal, it can be
-reconsider later.
+reconsider later. The `encrypt` function doesn't return the MAC and
+ciphertext separately, so we don't store them separately. However,
+each chunk needs its own [nonce][], which we will generate. We'll use
+a 96-bit (or 12-byte) nonce. We'll use the [rand crate][] to generate
+random bytes.
 
 The chunk sent to the server will be encoded as follows:
 
 * chunk format: a 32-bit unsigned integer, 0x0001
-* length of the MAC: a 32-bit unsigned integer
-* the MAC
-* length of ciphertext: a 32-bit unsigned integer
+* a 12-byte nonce unique to the chunk
 * the ciphertext
 
-The format version prefix allows for a modicum of future-proofing.
+The format version prefix dictates the content and structure of the
+chunk. This document defines version 1 of the format. The Obnam client
+will refuse to operate on backup generations which use chunk formats
+it cannot understand.
+
 
 [AEAD]: https://en.wikipedia.org/wiki/Authenticated_encryption#Authenticated_encryption_with_associated_data_(AEAD)
 [MAC]: https://en.wikipedia.org/wiki/Message_authentication_code
 [aes-gcm crate]: https://crates.io/crates/aes-gcm
 [PBKDF2]: https://en.wikipedia.org/wiki/PBKDF2
 [pbkdf2 crate]: https://crates.io/crates/pbkdf2
+[nonce]: https://en.wikipedia.org/wiki/Cryptographic_nonce
+[rand crate]: https://crates.io/crates/rand
 
 
 # Acceptance criteria for the chunk server
-- 
cgit v1.2.1


From 8622b26fb0bf2d7c3ee8bb3037439eb5d696d95f Mon Sep 17 00:00:00 2001
From: Lars Wirzenius
Date: Thu, 11 Mar 2021 10:22:04 +0200
Subject: drop unnecessary "the"

---
 obnam.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/obnam.md b/obnam.md
index 1940894..1f12f51 100644
--- a/obnam.md
+++ b/obnam.md
@@ -895,7 +895,7 @@ first key, giving a ciphertext. It then computes a MAC of the
 ciphertext using the second key. Both the ciphertext and MAC are
 stored on the server.
 
-For decryption, the a MAC is computed against the retrieved
+For decryption, a MAC is computed against the retrieved
 ciphertext, and compared to the retrieved MAC. If the MACs differ,
 that's an error and no decryption is done. If they do match, the
 ciphertext is decrypted.
-- 
cgit v1.2.1


From b1934d85d1885f3dd7c6492f3df24d8d0fb648e4 Mon Sep 17 00:00:00 2001
From: Lars Wirzenius
Date: Thu, 11 Mar 2021 10:24:36 +0200
Subject: fix: tense

---
 obnam.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/obnam.md b/obnam.md
index 1f12f51..c5ef2c9 100644
--- a/obnam.md
+++ b/obnam.md
@@ -930,7 +930,7 @@ backups.
 
 Obnam will use the [aes-gcm crate][] for AEAD, since it has been
 audited. If that choice turns out to be less than optimal, it can be
-reconsider later. The `encrypt` function doesn't return the MAC and
+reconsidered later. The `encrypt` function doesn't return the MAC and
 ciphertext separately, so we don't store them separately. However,
 each chunk needs its own [nonce][], which we will generate. We'll use
 a 96-bit (or 12-byte) nonce. We'll use the [rand crate][] to generate
-- 
cgit v1.2.1


From 498bb29dbcfc913b876e2cf5a3389134dcf1f9b4 Mon Sep 17 00:00:00 2001
From: Lars Wirzenius
Date: Thu, 11 Mar 2021 10:43:15 +0200
Subject: fix: note little-endianness of chunk format version number

---
 obnam.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/obnam.md b/obnam.md
index c5ef2c9..6521372 100644
--- a/obnam.md
+++ b/obnam.md
@@ -938,7 +938,8 @@ random bytes.
 
 The chunk sent to the server will be encoded as follows:
 
-* chunk format: a 32-bit unsigned integer, 0x0001
+* chunk format: a 32-bit unsigned integer, 0x0001, store in
+  little-endian form.
 * a 12-byte nonce unique to the chunk
 * the ciphertext
 
-- 
cgit v1.2.1
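
Note on the chunk layout this series ends up with (a little-endian 32-bit format version with value 1, then a 12-byte nonce, then the ciphertext): the framing can be illustrated with a minimal Rust sketch using only the standard library. This is not taken from the Obnam source. The names `encode_chunk`, `decode_chunk`, `ChunkError`, and the constants are hypothetical, and the ciphertext is treated as opaque bytes that the AEAD step has already produced (with the MAC kept inside it, as the plan describes). Rejecting unknown versions mirrors the statement that the client refuses chunk formats it cannot understand.

```rust
/// Chunk format version defined by the plan (constant name is hypothetical).
const CHUNK_FORMAT_V1: u32 = 1;
/// Size of the little-endian version prefix in bytes.
const VERSION_LEN: usize = 4;
/// Size of the per-chunk nonce in bytes (96 bits).
const NONCE_LEN: usize = 12;

/// Errors that parsing an encoded chunk can produce (hypothetical type).
#[derive(Debug, PartialEq)]
enum ChunkError {
    /// Input is shorter than the version prefix plus the nonce.
    TooShort,
    /// The version prefix names a format this code does not understand.
    UnsupportedVersion(u32),
}

/// Frame an already-encrypted chunk: version (LE u32), nonce, ciphertext.
fn encode_chunk(nonce: &[u8; NONCE_LEN], ciphertext: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(VERSION_LEN + NONCE_LEN + ciphertext.len());
    out.extend_from_slice(&CHUNK_FORMAT_V1.to_le_bytes());
    out.extend_from_slice(nonce);
    out.extend_from_slice(ciphertext);
    out
}

/// Split an encoded chunk back into its nonce and ciphertext.
/// Decryption itself is out of scope for this sketch.
fn decode_chunk(bytes: &[u8]) -> Result<([u8; NONCE_LEN], &[u8]), ChunkError> {
    if bytes.len() < VERSION_LEN + NONCE_LEN {
        return Err(ChunkError::TooShort);
    }
    let (version_bytes, rest) = bytes.split_at(VERSION_LEN);
    let mut version_buf = [0u8; VERSION_LEN];
    version_buf.copy_from_slice(version_bytes);
    let version = u32::from_le_bytes(version_buf);
    if version != CHUNK_FORMAT_V1 {
        return Err(ChunkError::UnsupportedVersion(version));
    }
    let (nonce_bytes, ciphertext) = rest.split_at(NONCE_LEN);
    let mut nonce = [0u8; NONCE_LEN];
    nonce.copy_from_slice(nonce_bytes);
    Ok((nonce, ciphertext))
}
```

Because the final format drops the explicit length fields, the ciphertext is simply everything after the first 16 bytes, so the sketch relies on the chunk boundary being known from the transport or storage layer.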