author: jandersonlee@23dda5f3b375f2e05809d1438817748bb3b98d6e <jandersonlee@web> 2016-04-29 03:46:53 +0000
committer: admin <admin@branchable.com> 2016-04-29 03:46:53 +0000
commit: 9ed4cc8827ad3153c2540a4ba64d3ce743ec9696
tree: 24fd101340df999ead3bb8bf5f7a889ed5e142ce
parent: 703bc13e517c854963b3e967698ce6747faa9cfb
-rw-r--r-- format-green-albatross.mdwn 22
1 files changed, 1 insertions, 21 deletions
diff --git a/format-green-albatross.mdwn b/format-green-albatross.mdwn
index aab64ff..f1486d4 100644
--- a/format-green-albatross.mdwn
+++ b/format-green-albatross.mdwn
@@ -55,24 +55,6 @@ A bag is implemented as a Python `dict` object:
'blobs': [...],
}
-The `blobs` field contains the blobs. Each blob may be an arbitrary
-byte string (for chunks), or an encoded Python object.
-
-Note that having a fixed number of 3 levels requires up to 16Mi+64Ki+256 (2^24+2^16+2^8) directories in the repository. If there are fewer bags, many will be the only file in their directory. If there are more than 4Gi bags, many of the bottom directories will have more than 256 bag files. Since bag ids are 64bit numbers this could conceivably be many bags per leaf directory in a huge multi-TB repo.
-
-An alternative might be to have a get_bag() and put_bag() function for the repo that determine how many levels to use and a `repo.conf` file at the top level that specifies either the exact or maximum number of levels to try. A repo configured with {'minlevels':2, 'maxlevels': 3}, would look first for `12/34/56/1234567890abcdef.bag`, then `12/34/1234567890abcdef.bag` (one less level). If configured with {'levels': 3}, it would only look for `12/34/56/1234567890abcdef.bag`.
-
-Moving files between directories is a fairly cheap operation in most filesystems and changing the number of levels could be done by an auxiliary process if needed when a repo starts to fill up. This will be rare as it requires a 256X expansion to add a level. However having the flexibility over the number of levels may make this format more generally useful for both large and small use cases and does not add much code complexity.
-
-Preparing for the Cloud
------------------------
-
-While obnam is intended for disk/sftp backup its ability to have multiple repo backends makes it an ideal fit for creating a backup system that can
-also tap into "the cloud". With a little forethought it may be made even more suitable to this task.
-
-The repository consists primarily of immutable numbered bags of objects, plus a few configuration files and a few special mutable bags. Most bags are numbered (and normally mapped onto files as above). A few special bags (like the client list mentioned below) are named.
-
-This lays open the possibility for repositories that can stash most of the immutable bags remotely (in the cloud), possibly caching or replicating some locally. To speed up backup processing we may want to separate out those bags that hold only file chunks from those that store more essential data (such as chunk indices and/or directory trees and metadata), and mark the the latter to be cached locally. Replicating/caching just the chunk index locally and storing the chunks remotely would provide for faster backups with less local overhead. Also replicating or caching the directory tree and metadata information locally would make a tree-walk faster as well.
Object identifiers
==================
@@ -94,7 +76,7 @@ For example, the first and third objects stored in the bag with id
Note the use of hexadecimal for the bag id (so all bag identifiers are
of the same length), and indexing in decimal, starting from zero.
-We will keep numbered bags effectively immutable so that an object id does not
+We will keep bags effectively immutable so that an object id does not
need to change. This means that a bag may contain unused objects. If
it turns out that that's wasting too much data, we can "pack" bags by
replacing the unused blobs with empty values (Python's None) to save
@@ -113,8 +95,6 @@ and each item in the bag has the following structure:
'encryption-key': None,
}
-This bag is mutable. When a new client is added, it is overwritten.
-
Chunks
======