From 64cf9be06f0f3827e4358849b3643533b1a2a4d7 Mon Sep 17 00:00:00 2001
From: Lars Wirzenius
Date: Sun, 20 Mar 2022 14:30:40 +0200
Subject: add planning meeting for new iteration

Sponsored-by: author
---
 blog/2022/03/21/planning.mdwn | 239 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 239 insertions(+)
 create mode 100644 blog/2022/03/21/planning.mdwn

diff --git a/blog/2022/03/21/planning.mdwn b/blog/2022/03/21/planning.mdwn
new file mode 100644
index 0000000..bdb79c1
--- /dev/null
+++ b/blog/2022/03/21/planning.mdwn
@@ -0,0 +1,239 @@
+[[!meta title="Iteration planning: March 21 – April 3"]]
+[[!meta date="Mon, 21 Mar 2022 10:00:00 +0200"]]
+[[!tag meeting]]
+
+[[!toc levels=2]]
+
+# Assessment of the iteration that has ended
+
+[previous iteration]: /blog/2022/03/06/planning
+
+The goal of the [previous iteration][] was:
+
+> The goal for this iteration is to prepare for future schema changes.
+
+This was completed. The Obnam client now supports more than one
+version of the schema for backup generations, and can restore from any
+of them. The server does not do that yet: if anything about its
+database schema changes, or its API changes, the change is breaking
+and necessitates starting over with a new, empty repository with the
+new server version. This will need to be addressed later. (See
+[[!issue 199]].)
+
+# Discussion
+
+## Current development theme
+
+The current theme of development for Obnam is **convenience**. The choices
+are performance, security, convenience, and tidy-up, at least
+currently.
+
+## Breaking changes ahead
+
+Lars foresees several upcoming breaking changes to how Obnam
+encryption is done, client/server authentication, and more. Ideally
+these would all be done in ways that don't require users to start over
+with their backups, but it seems like a lot of effort for a short-term
+gain.
+Thus, Lars intends to take advantage of the fact that nobody
+uses Obnam yet and make some fundamental changes over the next few
+iterations. Part of those changes will be to make it easier to evolve
+Obnam without redo-all-your-backups changes, but some will be so
+fundamental that it doesn't seem worth supporting both the old and new
+ways.
+
+Lars plans the following such fundamental changes at the moment:
+
+* Add a "trusted root object" to the Obnam system, to replace the
+  current approach of "independent backup generation" objects. This
+  will increase security, as well as make the ordering of backup
+  generations more reliable.
+  - this change is planned for this iteration
+  - the old "generation chunk" approach will be dropped
+* Add authentication to the client/server protocol. Details to be
+  discussed later.
+* Refactor the server to have database schema versioning.
+* Add versioning to the client/server protocol.
+
+After these, Lars hopes that Obnam will be in a state where it's
+feasible to evolve the client and server mostly without the kind of
+breaking changes that require starting over with an empty backup
+repository.
+
+## User root object
+
+Currently, the Obnam client stores backup generations on the server
+without an explicit ordering. Each generation has a timestamp, which
+is used to sort the generations into an order, but that's not good
+enough. See [[!issue 34]] (_Uses timestamps to order backup
+generations_).
+
+* If two backup runs overlap in time, they might create new backups
+  that are incremental to a common ancestor, but not related to each
+  other. This will at minimum be confusing.
+* There's no guarantee the backup with a later timestamp is actually
+  newer: clock skew, and other errors, may affect things.
+* Timestamps are cleartext data. This leaks information. Not good.
+* Timestamps are not covered by a signature. Double-plus ungood.
+  This allows an attacker to change them, and they can make the
+  client think the latest backup is actually the oldest one. At
+  minimum, this means further incremental backups may back up files
+  needlessly, but it may also mean the wrong backup gets restored.
+
+Lars proposes a change to protect against this threat model:
+
+* An attacker who can delete or modify files in the backup repository
+  must not be able to alter the contents or ordering of backup
+  generations.
+
+To fix this, Lars is thinking of the following approach:
+
+* Each user has a "root object", which lists their backup generations,
+  and metadata of each generation.
+* The root object is a chunk, so it's encrypted and authenticated with
+  AEAD. This prevents an attacker from inspecting it, or from
+  modifying it without detection.
+* The root object chunk has a random chunk id, but the label "user" so
+  that a client can easily find it. It is otherwise exactly like other
+  chunks.
+* Chunk metadata will be reduced to only `label`. The `generation` and
+  `ended` metadata for chunks will be removed. This will force another
+  breaking change, sorry.
+* The root object will have a reference to the previous one.
+* The client will find all root objects, and pick the newest one. This
+  is because the client has no way to tell the server to delete any
+  chunk, so it can't delete an old root object.
+* Later, when we add client authentication, the server will store the
+  data in the root object associated with the client account, and
+  allow the client to update it. However, adding authentication is too
+  big a change for this iteration.
+
+A root object could be serialized into JSON like this:
+
+~~~json
+{
+    "client": "exolobe1",
+    "previous_root": "6d381c04-a83a-11ec-a3e9-fba06bac23fd",
+    "timestamp": "2022-03-20T09:07:17+00:00",
+    "backups": [
+        {
+            "chunk-id": "7cb90434-a82d-11ec-9383-b31e0b1b81aa",
+            "ended": "2022-03-20T09:07:17+00:00"
+        },
+        {
+            "chunk-id": "7aa3681a-a82d-11ec-b24b-e3adb644891b",
+            "ended": "2022-03-21T09:07:17+00:00"
+        }
+    ]
+}
+~~~
+
+The `backups` field lists the generations in order, with the oldest one
+first.
+
+With this approach, everything linked from the root object chunk, or
+found by following links further, can be assured to be in the right
+order and to be unmodified.
+
+An attacker can still replace the root object chunk with an older one.
+This can be mitigated by checking the root object timestamp: if it's
+unexpectedly old, something is wrong. A stronger mitigation would be
+for the client to store the timestamp locally and check it on the next
+backup run. However, that requires data to not be lost on the client
+end, which is what backups are meant to protect against, so it's not a
+very satisfactory solution.
+
+If the limit for how old the root object chunk can be is too long, an
+attacker can keep replacing the latest one with one that's as old as
+it can be without triggering an alarm. That would mean that any
+intervening backups get lost, which would be bad.
+
+Attacks on the root object may need to be mitigated in future
+iterations.
+
+
+# Repository review
+
+Lars reviewed all the open issues, merge requests, and CI pipelines
+for all the projects in the Obnam group on gitlab.com.
+
+## [Container Images](https://gitlab.com/obnam/container-image)
+
+* Open issues: 0
+* Merge requests: 0
+* Additional branches: 0
+* CI: OK, ran on Monday, March 14
+
+## [obnam.org](https://gitlab.com/obnam/obnam.org)
+
+* Open issues: 0
+* Merge requests: 0
+* Additional branches: 0
+* CI: not defined
+
+## [obnam-benchmark](https://gitlab.com/obnam/obnam-benchmark)
+
+* Open issues: 11
+* Merge requests: 0
+* Additional branches: 0
+* CI: not defined
+
+## [summain](https://gitlab.com/obnam/summain)
+
+* Open issues: 0
+* Merge requests: 0
+* Additional branches: 0
+* CI: not defined
+
+## [obnam](https://gitlab.com/obnam/obnam)
+
+* Open issues: 54
+* Merge requests: 2
+  - [[!mr 214]] - _performance metrics_
+    - needs thinking and further work
+  - [[!mr 222]] - _add backup database schema to evolve; break server
+    database_
+    - to be merged on Tuesday
+* Additional branches: 0
+* CI: OK
+
+# Goals
+
+## Goal for 1.0 (not changed this iteration)
+
+The goal for version 1.0 is for Obnam to be an utterly boring backup
+solution for Linux command line users. It should just work, be
+performant, secure, and well-documented.
+
+It is not a goal for version 1.0 to have been ported to other
+operating systems, but if there are volunteers to do that, and to
+commit to supporting their port, ports will be welcome.
+
+Other user interfaces are likely to happen only after 1.0.
+
+The server component will support multiple clients in a way that
+doesn’t let them see each other’s data. It is not a goal for clients
+to be able to share data, even if the clients trust each other.
+
+## Goal for the next few iterations (not changed for this iteration)
+
+The goal for the next few iterations is to have Obnam be easier and safer
+to change, both for developers and end users. This means that
+developers need to be able to make breaking changes without users
+having to suffer.
+Users shall be able to migrate their data when they
+feel it worthwhile, not just because there is a new version.
+
+## Goal for this iteration (new for this iteration)
+
+The goal of this iteration is to add a "root object" for a user's
+backups, which lists the backup generations in order.
+
+# Commitments for this iteration
+
+Lars intends to work on the "root object" change, as described above.
+This will affect, and hopefully resolve, the following issues:
+
+* [[!issue 34]] - _Uses timestamps to order backup generations_
+* [[!issue 62]] - _Describe how chunks relate to each other_
+
+# Meeting participants
+
+* Lars Wirzenius
-- 
cgit v1.2.1
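
The root object selection rule from the "User root object" section of the post (find all root objects, pick the newest one, and treat an unexpectedly old one as a sign of trouble) can be sketched in Python. This is a minimal sketch, not Obnam's actual code. It assumes the root objects have already been fetched and decrypted into JSON strings shaped like the example in the post. The names `parse_root`, `pick_newest_root`, and `max_age` are hypothetical and do not appear in the patch.

```python
import json
from datetime import datetime, timedelta, timezone


def parse_root(raw):
    """Parse one decrypted root object (a JSON string) into a dict.

    The "timestamp" field is converted to a timezone-aware datetime so
    that root objects can be compared with each other.
    """
    root = json.loads(raw)
    root["timestamp"] = datetime.fromisoformat(root["timestamp"])
    return root


def pick_newest_root(raw_roots, now, max_age):
    """Return the newest root object, or raise if none is acceptable.

    raw_roots: iterable of JSON strings, one per root object found.
    now:       current time as a timezone-aware datetime.
    max_age:   hypothetical limit (timedelta) on how old the newest
               root object may be before it is considered suspicious.
    """
    roots = [parse_root(raw) for raw in raw_roots]
    if not roots:
        raise LookupError("no root object found")
    # The client cannot delete old root objects, so several may exist;
    # the one with the latest timestamp wins.
    newest = max(roots, key=lambda root: root["timestamp"])
    if now - newest["timestamp"] > max_age:
        raise ValueError("newest root object is unexpectedly old")
    return newest
```

As the post points out, an age limit that is too generous lets an attacker keep substituting the oldest root object that still passes the check, so a value for `max_age` would have to be chosen with the backup schedule in mind.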