[[!meta title="Repository format 'Green Albatross'"]]

This page describes the repository format called 'Green Albatross'. It
is a preliminary format, intended to improve Obnam performance over
[[format-6]].

Current development status is PONDERING; implementation has started,
but everything on this page may and probably will change. Until
declared stable, the on-disk data structures WILL change without
warning. Only use this format to help with its development.

Introduction
============

For background and the big picture, please read the [[ondisk]] page.
This page only discusses details specific to this format.

This repository format takes the following approach:

* Don't use the [Larch](http://liw.fi/larch/) B-tree implementation.
  It is intricate code and insufficiently fast. Implement any data
  structures directly.

* With few, specific exceptions, repository files are never updated.
  Everything is done copy-on-write. This enables caching. The
  exceptions are the root nodes of DAGs, so that it's easy to know
  where each DAG starts.

* Data is stored in objects of various types. Objects may be small,
  and to avoid having a file per object, objects are collected into
  bags. This includes chunks. Each bag is stored in its own file.
  Objects are identified by the bag identifier, plus an index into
  the bag.

* Objects are stored in a suitable serialisation encoding.

  - This is not Python pickles, since Obnam can't assume those are
    stable over the lifetime of a backup repository.

  - JSON is not used, since JSON is not suitable for storing binary
    data, such as filenames, without adding an encoding layer on top
    of JSON.

  - Previously this was specified as YAML, but YAML is not fast to
    parse, so it's not YAML either.

  - Instead, a simple, custom binary encoding will be used. This will
    encode ints, booleans, binary strings, or lists or dicts (dict
    keys being lists). A quick prototype shows this to be easy (it
    worked the first time), and fairly fast even without
    optimisation.

* Encryption is done exactly like in [[format-6]].

Bags
====

A bag is used to store a number of blobs. Bags are identified by a
random 64-bit integer, which is used as the filename of the bag. The
bags are stored in a 3-level directory structure, using the top three
octets of the bag id as the directory names. Thus, a bag whose id in
hexadecimal is 0x1234567890abcdef would be stored as
`12/34/56/1234567890abcdef.bag`.

A bag is implemented as a Python `dict` object:

    {
        'bag-id': 'cafef00d',
        'blobs': [...],
    }

The `blobs` field contains the blobs. Each blob may be an arbitrary
byte string (for chunks), or an encoded Python object.

Object identifiers
==================

Object identifiers are a pair consisting of the bag id and an index
into the bag. Since the identifiers are used frequently, it is
practical to store them as a unit rather than as a pair. Further,
they will be visible to the user (and, especially, the developer), so
the following syntax is used:

    BAGID.INDEX

For example, the first and third objects stored in the bag with id
0x1234567890abcdef would be:

    1234567890abcdef.0
    1234567890abcdef.2

Note the use of hexadecimal for the bag id (so that all bag
identifiers are of the same length), and indexing in decimal,
starting from zero.

Bags are kept effectively immutable so that an object id never needs
to change. This means that a bag may contain unused objects. If it
turns out that this wastes too much space, bags can be "packed" by
replacing the unused blobs with empty values (Python's `None`). This
mutates a bag, but only in ways that (correct) users won't notice.
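To make the naming scheme concrete, here is a minimal, non-normative
sketch of how a bag id maps to a filename, and how object ids are
formed and split. The function names are illustrative only and not
part of the format.

    # A minimal sketch of the naming scheme described above; the
    # function names are hypothetical and not part of the format.

    import os

    def bag_filename(bag_id):
        '''Return the repository-relative pathname for a bag id.

        The bag id is a 64-bit integer; its top three octets become
        the three directory levels, e.g. 0x1234567890abcdef maps to
        12/34/56/1234567890abcdef.bag.
        '''
        hex_id = '%016x' % bag_id
        return os.path.join(hex_id[0:2], hex_id[2:4], hex_id[4:6],
                            '%s.bag' % hex_id)

    def make_object_id(bag_id, index):
        '''Combine a bag id and an index into a BAGID.INDEX string.'''
        return '%016x.%d' % (bag_id, index)

    def parse_object_id(object_id):
        '''Split a BAGID.INDEX string back into (bag_id, index).'''
        hex_id, index = object_id.split('.')
        return int(hex_id, 16), int(index)

    assert bag_filename(0x1234567890abcdef) == \
        '12/34/56/1234567890abcdef.bag'
    assert make_object_id(0x1234567890abcdef, 2) == '1234567890abcdef.2'
    assert parse_object_id('1234567890abcdef.0') == (0x1234567890abcdef, 0)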
Client list
===========

The client list is stored as `client-list/data.bag` in the
repository, and each item in the bag has the following structure:

    {
        'client-name': 'foo',
        'client-id': 123,
        'encryption-key': None,
    }

Chunks
======

Chunks are stored in bags. The chunk data is just a binary blob.

Chunk indexes
=============

Chunk indexes map a checksum (using the user's chosen algorithm) to a
list of chunk ids, and a chunk id to a list of client ids. The root
object of the indexes is:

    {
        'checksums': {
            'algorithm': 'SHA-1',
            'root-id': '1234567890abcdef.1',
        },
        'used-by-tree': '1234567890abcdef.0',
    }

Checksum to chunk ids
---------------------

The mapping from a checksum value to a list of chunk ids is done
using a lookup tree that is vaguely similar to a B-tree. The tree
contains index nodes and leaf nodes. Leaf nodes store the actual
mappings:

    {
        'e242ed3bffccdf271b7fbaf34ed72d089537b42f': [
            '1234567890abcdef.3',
            '1234567890abcdef.4',
        ],
        'f1d2d2f924e986ac86fdf7b36c94bcdf32beec15': [
            '1234567890abcdef.5',
        ],
    }

In other words, a leaf node is a `dict` mapping a checksum to a list
of chunk ids whose content has that checksum. Note that the list may
be longer than one item.

An index node looks like this:

    [
        {
            'first-checksum': 'e242ed3bffccdf271b7fbaf34ed72d089537b42f',
            'last-checksum': 'f1d2d2f924e986ac86fdf7b36c94bcdf32beec15',
            'leaf-id': '1234567890abcdef.2',
            'index-id': None,
        },
    ]

In other words:

* The index node is a list of mappings, where each mapping
  corresponds to an object on the next level of the lookup tree.

* `first-checksum` is the smallest checksum in the sub-tree being
  referenced.

* `last-checksum` is the largest checksum.

* `leaf-id` is the object id of the leaf node, if the next level
  consists of leaf nodes.

* `index-id` is the object id of the index node, if the next level
  consists of index nodes.

The lookup tree is created in a copy-on-write manner. No node is ever
overwritten, but a node may be deleted once it is no longer
referenced. The tree is not kept in balance, to keep the code
maintaining it as simple as possible.

When a new checksum is inserted into the lookup tree, it is added to
an existing leaf node by creating a new leaf node that is a copy of
the old one, and adding the new checksum to the new copy. A leaf node
that grows too big is split in half. Any index nodes on the path to
an updated leaf node get updated as well.

Chunk id to client ids
----------------------

This tree is similar in structure to the checksum tree, but index
nodes look like this:

    [
        {
            'first-client-id': '1234567890abcdef.50',
            'last-client-id': '1234567890abcdef.51',
            'leaf-id': '1234567890abcdef.49',
        },
    ]

Leaf nodes look like this:

    {
        '1234567890abcdef.32': [
            123,
            456,
        ],
    }

Leaf nodes map a chunk id to a list of ids of the clients that use
the chunk.
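As a concrete illustration of how either lookup tree is meant to be
read, here is a minimal sketch of a lookup in the checksum tree; the
chunk-id tree would be walked the same way. The `get_object` helper is
hypothetical: it stands for loading and decoding the object with the
given object id, and is not part of the format itself.

    # A minimal sketch of a lookup in the checksum tree described
    # above. get_object() is a hypothetical helper that loads and
    # decodes the object with the given object id.

    def find_chunk_ids(root_index_id, checksum, get_object):
        '''Return the list of chunk ids for a checksum, or [].'''
        node = get_object(root_index_id)
        while True:
            # node is an index node: a list of entries covering
            # consecutive checksum ranges on the next level down.
            for entry in node:
                if entry['first-checksum'] <= checksum <= entry['last-checksum']:
                    if entry.get('index-id') is not None:
                        # The next level is index nodes; descend.
                        node = get_object(entry['index-id'])
                        break
                    # The next level is leaf nodes; do the final lookup.
                    leaf = get_object(entry['leaf-id'])
                    return leaf.get(checksum, [])
            else:
                # No entry's range covers this checksum.
                return []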
Per-client data
===============

The data stored separately for each client starts with a CLIENT
object:

    {
        'client-name': 'foo',
        'generations': ['1234567890abcdef.77'],
    }

Each generation is a GEN object:

    {
        'started': '2015-04-06T17:35:41',
        'ended': '2015-04-06T17:35:42',
        'checkpoint': False,
        'root-dir': '1234567890abcdef.77',
    }

Each directory in the live data is stored in a DIR object. The object
stores the metadata for the directory, plus the basenames and metadata
for each file in the directory (anything that isn't a subdirectory),
plus the basename and DIR object id of each subdirectory:

    {
        'metadata': {
            'st_mode': 0777,
            'st_uid': 0,
            'username': 'root',
        },
        'files': {
            'README': {
                'metadata': {
                    'st_mode': 0644,
                    'st_uid': 0,
                    'username': 'root',
                },
            },
        },
        'subdirs': {
            '.git': '1234567890abcdef.127',
        },
    }

Again, nothing is updated in place; everything is updated
copy-on-write. When a node is updated, its parents all the way up to
the root directory also get updated.
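To illustrate how live data would be found in this layout, here is a
minimal, non-normative sketch of resolving a pathname within a
generation by following DIR objects from the generation's `root-dir`.
The `get_object` helper is again hypothetical (it stands for loading
and decoding the object with the given object id), and the path
handling is deliberately simplified.

    # A minimal sketch of resolving a pathname within a generation by
    # walking DIR objects. get_object() is a hypothetical helper that
    # loads and decodes the object with the given object id.

    def find_metadata(gen, pathname, get_object):
        '''Return the metadata dict for a pathname, relative to the
        generation's root directory. Raises KeyError if the pathname
        does not exist in the generation.'''
        dir_obj = get_object(gen['root-dir'])
        components = [c for c in pathname.split('/') if c]
        if not components:
            return dir_obj['metadata']
        # Descend through the subdirectories named by all but the
        # last path component.
        for name in components[:-1]:
            dir_obj = get_object(dir_obj['subdirs'][name])
        last = components[-1]
        if last in dir_obj['subdirs']:
            return get_object(dir_obj['subdirs'][last])['metadata']
        return dir_obj['files'][last]['metadata']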