author Lars Wirzenius <liw@liw.fi> 2022-03-20 14:30:40 +0200
committer Lars Wirzenius <liw@liw.fi> 2022-03-20 14:30:40 +0200
commit 64cf9be06f0f3827e4358849b3643533b1a2a4d7 (patch)
tree f810a530918bca60c4c867d2b4a7c3ca61476336
parent 7ccd86d4a4bbfd73b30f5dd14ce06b460bd84b45 (diff)
add planning meeting for new iteration
Sponsored-by: author
-rw-r--r-- blog/2022/03/21/planning.mdwn 239
1 files changed, 239 insertions, 0 deletions
diff --git a/blog/2022/03/21/planning.mdwn b/blog/2022/03/21/planning.mdwn
new file mode 100644
index 0000000..bdb79c1
--- /dev/null
+++ b/blog/2022/03/21/planning.mdwn
@@ -0,0 +1,239 @@
+[[!meta title="Iteration planning: March 21 &ndash; April 3"]]
+[[!meta date="Mon, 21 Mar 2022 10:00:00 +0200"]]
+[[!tag meeting]]
+
+[[!toc levels=2]]
+
+# Assessment of the iteration that has ended
+
+[previous iteration]: /blog/2022/03/06/planning
+
+The goal of the [previous iteration][] was:
+
+> The goal for this iteration is to prepare for future schema changes.
+
+This was completed. The Obnam client now supports more than one
+version of the schema for backup generations, and can restore from any
+of them. The server does not do that yet: if anything about its
+database schema or its API changes, the change is breaking
+and necessitates starting over with a new, empty repository with the
+new server version. This will need to be addressed later. (See
+[[!issue 199]].)
+
+# Discussion
+
+## Current development theme
+
+The current theme of development for Obnam is **convenience**. The
+themes to choose from are currently performance, security,
+convenience, and tidy-up.
+
+## Breaking changes ahead
+
+Lars foresees several upcoming breaking changes to how Obnam
+encryption is done, client/server authentication, and more. Ideally
+these would all be done in ways that don't require users to start over
+with their backups, but it seems like a lot of effort for a short-term
+gain. Thus, Lars intends to take advantage of the fact that nobody
+uses Obnam yet and make some fundamental changes over the next few
+iterations. Part of those changes will be to make it easier to evolve
+Obnam without redo-all-your-backups changes, but some will be so
+fundamental that it doesn't seem worth supporting both the old and new
+ways.
+
+Lars plans the following such fundamental changes at the moment:
+
+* Add a "trusted root object" to the Obnam system, to replace the
+ current approach of "independent backup generation" objects. This
+ will increase security, as well as make the ordering of backup
+ generations more reliable.
+ - this change is planned for this iteration
+ - the old "generation chunk" approach will be dropped
+* Add authentication to the client/server protocol. Details to be
+ discussed later.
+* Refactor the server to have database schema versioning.
+* Add versioning to the client/server protocol.
+
+After these, Lars hopes that Obnam will be in a state where it's
+feasible to evolve the client and server mostly without the kind of
+breaking changes that require starting over with an empty backup
+repository.
+
+## User root object
+
+Currently, the Obnam client stores backup generations on the server
+without an explicit ordering. Each generation has a timestamp, which
+is used to sort the generations into an order, but that's not good
+enough. See [[!issue 34]] (_Uses timestamps to order backup
+generations_).
+
+* If two backup runs overlap in time, they might create new backups
+ that are incremental to a common ancestor, but not related to each
+ other. This will at minimum be confusing.
+* There's no guarantee the backup with a later timestamp is actually
+ newer: clock skew, and other errors, may affect things.
+* Timestamps are cleartext data. This leaks information. Not good.
+* Timestamps are not covered by a signature. Double-plus ungood. This
+ allows an attacker to change them, and they can make the client
+ think the latest backup is actually the oldest one. At minimum, this
+ means further incremental backups may back up files needlessly, but
+ may also mean the wrong backup gets restored.
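The last point can be illustrated with a small sketch. It is not Obnam code: the generation ids and timestamps are made up, and the selection rule (take the greatest timestamp) is an assumption about how a client that trusts the timestamps might behave.

```python
from datetime import datetime

# Hypothetical generation records: (chunk id, cleartext timestamp).
generations = [
    ("gen-a", "2022-03-18T09:00:00+00:00"),
    ("gen-b", "2022-03-19T09:00:00+00:00"),
    ("gen-c", "2022-03-20T09:00:00+00:00"),
]


def latest_by_timestamp(gens):
    """Return the id of the generation with the greatest timestamp."""
    return max(gens, key=lambda g: datetime.fromisoformat(g[1]))[0]


honest_latest = latest_by_timestamp(generations)

# An attacker who can modify the repository rewrites the unsigned
# timestamp of the newest generation so that it looks like the oldest.
tampered = [
    (gen_id, "2022-03-01T09:00:00+00:00" if gen_id == "gen-c" else ts)
    for gen_id, ts in generations
]
tampered_latest = latest_by_timestamp(tampered)
```

With the honest data the client picks `gen-c`; after the tampering it picks `gen-b`, so the next incremental backup or restore would start from the wrong generation.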
+
+Lars proposes a change to protect against this threat model:
+
+* An attacker who can delete or modify files in the backup repository
+ must not be able to alter the contents or ordering of backup
+ generations.
+
+To fix this, Lars is thinking of the following approach:
+
+* Each user has a "root object", which lists their backup generations,
+ and metadata of each generation.
+* The root object is a chunk, so it's encrypted and authenticated with
+ AEAD. This prevents an attacker from modifying or inspecting it.
+* The root object chunk has a random chunk id, but label "user" so
+ that a client can easily find it. It is otherwise exactly like other
+ chunks.
+* Chunk metadata will be reduced to only `label`. The `generation` and
+ `ended` metadata for chunks will be removed. This will force another
+ breaking change, sorry.
+* The root object will have a reference to the previous one.
+* The client will find all root objects, and pick the newest one. This
+ is because the client has no way to tell the server to delete any
+ chunk, so it can't delete an old root object.
+* Later, when we add client authentication, the server will store the
+ data in the root object associated with the client account, and
+ allow the client to update it. However, adding authentication is too
+ big a change for this iteration.
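How the client decides which root object is the newest is not specified above. One possibility, sketched here purely as an assumption, is to use the reference to the previous root object: the newest root is the one that no other root names as its predecessor. The function name `newest_root` and the dict layout are hypothetical, and the root objects are assumed to be already decrypted and decoded.

```python
def newest_root(roots):
    """Pick the newest root object by following previous-root links.

    `roots` maps a root chunk id to its decoded root object (a dict with
    a "previous_root" key, which is None for the very first root).
    """
    referenced = {
        root["previous_root"]
        for root in roots.values()
        if root.get("previous_root") is not None
    }
    heads = [chunk_id for chunk_id in roots if chunk_id not in referenced]
    if len(heads) != 1:
        # Zero heads means a cycle, several mean the chain has forked.
        raise ValueError(f"expected exactly one newest root, found {len(heads)}")
    return heads[0]


# Hypothetical chain of three root objects.
roots = {
    "root-1": {"previous_root": None},
    "root-2": {"previous_root": "root-1"},
    "root-3": {"previous_root": "root-2"},
}
newest = newest_root(roots)
```

Raising an error on a forked chain is a design choice of this sketch; a real client might instead want to report the fork and let the user decide.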
+
+The contents of a root object could be serialized into JSON like this:
+
+~~~json
+{
+  "client": "exolobe1",
+  "previous_root": "6d381c04-a83a-11ec-a3e9-fba06bac23fd",
+  "timestamp": "2022-03-20T09:07:17+00:00",
+  "backups": [
+    {
+      "chunk-id": "7cb90434-a82d-11ec-9383-b31e0b1b81aa",
+      "ended": "2022-03-20T09:07:17+00:00"
+    },
+    {
+      "chunk-id": "7aa3681a-a82d-11ec-b24b-e3adb644891b",
+      "ended": "2022-03-21T09:07:17+00:00"
+    }
+  ]
+}
+~~~
+
+The `backups` field lists the generations in order, with the oldest one
+first.
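As a minimal sketch of how a client might read such a root object once it has been decrypted, the following parses the JSON example above with the Python standard library and relies only on list order, not on the `ended` timestamps. The variable names are illustrative and not part of any Obnam API.

```python
import json

# The decrypted root object from the example above.
root_json = """
{
  "client": "exolobe1",
  "previous_root": "6d381c04-a83a-11ec-a3e9-fba06bac23fd",
  "timestamp": "2022-03-20T09:07:17+00:00",
  "backups": [
    {
      "chunk-id": "7cb90434-a82d-11ec-9383-b31e0b1b81aa",
      "ended": "2022-03-20T09:07:17+00:00"
    },
    {
      "chunk-id": "7aa3681a-a82d-11ec-b24b-e3adb644891b",
      "ended": "2022-03-21T09:07:17+00:00"
    }
  ]
}
"""

root = json.loads(root_json)

# The list order is authoritative: oldest generation first, newest last.
generation_ids = [backup["chunk-id"] for backup in root["backups"]]
latest_generation = generation_ids[-1]
```

Because the order comes from the authenticated root object, `latest_generation` stays the same even if someone tampers with timestamps stored elsewhere.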
+
+With this approach, everything linked from the root object chunk, or
+found by following links further, can be assured to be in the right
+order and to be unmodified.
+
+An attacker can still replace the root object chunk with an older one.
+This can be mitigated by checking the root object timestamp: if it's
+unexpectedly old, something is wrong. A stronger mitigation would be
+for the client to store the timestamp locally and check it on the next
+backup run. However, that requires data to not be lost on the client
+end, which is what backups are meant to protect against, so it's not a
+very satisfactory solution.
+
+If the limit for how old the root object chunk can be is too long, an
+attacker can keep replacing the latest one with one that's as old as
+it can be without triggering an alarm. That would mean that any
+intervening backups get lost, which would be bad.
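The two mitigations discussed above could be sketched roughly as follows. Both function names, the two-day limit, and the idea of passing in a locally remembered timestamp are assumptions made for illustration; nothing here is decided for Obnam.

```python
from datetime import datetime, timedelta, timezone


def root_is_fresh(root_timestamp, now, max_age):
    """True if the root object is no older than `max_age` at time `now`."""
    return now - datetime.fromisoformat(root_timestamp) <= max_age


def root_not_rolled_back(root_timestamp, last_seen_timestamp):
    """True if the root is at least as new as the one the client saw last.

    `last_seen_timestamp` stands for a value stored locally on the client.
    """
    return (datetime.fromisoformat(root_timestamp)
            >= datetime.fromisoformat(last_seen_timestamp))


now = datetime(2022, 3, 21, 10, 0, 0, tzinfo=timezone.utc)
max_age = timedelta(days=2)  # hypothetical limit

fresh = root_is_fresh("2022-03-20T09:07:17+00:00", now, max_age)
stale = root_is_fresh("2022-03-10T09:07:17+00:00", now, max_age)
rolled_back = not root_not_rolled_back(
    "2022-03-10T09:07:17+00:00", "2022-03-20T09:07:17+00:00"
)
```

The age check needs no local state but only catches roots older than the limit, while the comparison against a stored timestamp catches any rollback as long as the local state survives.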
+
+Attacks on the root object may need to be mitigated in future
+iterations.
+
+
+# Repository review
+
+Lars reviewed all the open issues, merge requests, and CI pipelines
+for all the projects in the Obnam group on gitlab.com.
+
+## [Container Images](https://gitlab.com/obnam/container-image)
+
+* Open issues: 0
+* Merge requests: 0
+* Additional branches: 0
+* CI: OK, ran on Monday, March 14
+
+## [obnam.org](https://gitlab.com/obnam/obnam.org)
+
+* Open issues: 0
+* Merge requests: 0
+* Additional branches: 0
+* CI: not defined
+
+## [obnam-benchmark](https://gitlab.com/obnam/obnam-benchmark)
+
+* Open issues: 11
+* Merge requests: 0
+* Additional branches: 0
+* CI: not defined
+
+## [summain](https://gitlab.com/obnam/summain)
+
+* Open issues: 0
+* Merge requests: 0
+* Additional branches: 0
+* CI: not defined
+
+## [obnam](https://gitlab.com/obnam/obnam)
+
+* Open issues: 54
+* Merge requests: 2
+ - [[!mr 214]] - _performance metrics_
+ - needs thinking and further work
+ - [[!mr 222]] - _add backup database schema to evolove; break server
+ database_
+ - to be merged on Tuesday
+* Additional branches: 0
+* CI: OK
+
+# Goals
+
+## Goal for 1.0 (not changed this iteration)
+
+The goal for version 1.0 is for Obnam to be an utterly boring backup
+solution for Linux command line users. It should just work, be
+performant, secure, and well-documented.
+
+It is not a goal for version 1.0 to have been ported to other
+operating systems, but if there are volunteers to do that, and to
+commit to supporting their port, ports will be welcome.
+
+Other user interfaces are likely to happen only after 1.0.
+
+The server component will support multiple clients in a way that
+doesn’t let them see each other’s data. It is not a goal for clients
+to be able to share data, even if the clients trust each other.
+
+## Goal for the next few iterations (not changed for this iteration)
+
+The goal for the next few iterations is to have Obnam be easier and safer
+to change, both for developers and end users. This means that
+developers need to be able to make breaking changes without users
+having to suffer. Users shall be able to migrate their data when they
+feel it worthwhile, not just because there is a new version.
+
+## Goal for this iteration (new for this iteration)
+
+The goal of this iteration is to add a "root object" for a user's
+backups, which lists the backup generations in order.
+
+# Commitments for this iteration
+
+Lars intends to work on the "root object" change, as described above.
+This will affect, and hopefully resolve, the following issues:
+
+* [[!issue 34]] - _Uses timestamps to order backup generations_
+* [[!issue 62]] - _Describe how chunks relate to each other_
+
+# Meeting participants
+
+* Lars Wirzenius