Diffstat (limited to 'bugs/salsa-tins.mdwn')
-rw-r--r-- bugs/salsa-tins.mdwn 39
1 files changed, 0 insertions, 39 deletions
diff --git a/bugs/salsa-tins.mdwn b/bugs/salsa-tins.mdwn
deleted file mode 100644
index 4525dd4..0000000
--- a/bugs/salsa-tins.mdwn
+++ /dev/null
@@ -1,39 +0,0 @@
-[[!tag obnam-performance]]
-
-Problem: If chunk size is reasonably large (say, a megabyte), then
-most files will be smaller, and the repository ends up with a large
-number of small files.
-
-Idea: collect chunks into groups, called "salsa tins".
-
-- salsa tin = list of chunks
-- salsa tin has an id
-- chunk id = salsa tin id + suitable number of extra bits for
- index into list
-- chunk id may be 64 bits total, or 64+32, or whatever seems convenient
-- no chunk gets stored alone, only in salsa tins
-
-This lets a client put things into the repository at will, without
-synchronisation or locking beyond what the filesystem provides
-(exclusive creation of files).
-
-
----
-
-Having multiple chunks in a single file complicates the logic for
-managing files in the repository, and deleting unused chunks.
-
-Therefore, an alternative idea: instead of shoving multiple chunks
-into one file, allow files to use parts of chunks. Currently a
-file's metadata lists the chunks that have its contents. Change
-this to be a list of (chunk id, offset, length) triplets, where
-offset and length specify a part of a chunk. This way, a client can
-create one chunk that contains the data of many small files, and
-they can all just use the relevant part of the chunk. Managing
-removal of those files is easy: it is the current code without
-modification.
-
---liw
-
-
-This is implemented in git for FORMAT GREEN ALBATROSS. [[done]] --liw
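
The two ideas in the deleted page can be illustrated with a minimal Python sketch. It is not Obnam's code: every name here (`INDEX_BITS`, `make_chunk_id`, `split_chunk_id`, `pack_small_files`, `read_file`) is hypothetical, the 32-bit index width is only one of the sizes the page suggests, and chunks are modelled as an in-memory dict rather than repository files.

```python
# Hypothetical sketch; none of these names come from Obnam itself.
INDEX_BITS = 32  # assumed width of the index into a salsa tin's chunk list


def make_chunk_id(tin_id, index):
    """Combine a salsa tin id and a list index into one chunk id."""
    if not 0 <= index < (1 << INDEX_BITS):
        raise ValueError("index does not fit in INDEX_BITS")
    return (tin_id << INDEX_BITS) | index


def split_chunk_id(chunk_id):
    """Recover (tin id, index) from a chunk id built by make_chunk_id."""
    return chunk_id >> INDEX_BITS, chunk_id & ((1 << INDEX_BITS) - 1)


def pack_small_files(chunk_id, files):
    """Concatenate the data of several small files into one chunk.

    `files` maps a file name to its bytes. Returns the chunk data and,
    for each file, a list of (chunk id, offset, length) triplets.
    """
    parts = []
    refs = {}
    offset = 0
    for name, data in files.items():
        parts.append(data)
        refs[name] = [(chunk_id, offset, len(data))]
        offset += len(data)
    return b"".join(parts), refs


def read_file(chunks, triplets):
    """Reassemble file contents from (chunk id, offset, length) triplets."""
    return b"".join(
        chunks[chunk_id][offset:offset + length]
        for chunk_id, offset, length in triplets
    )
```

With the triplet approach a chunk is still one file in the repository, so a chunk can be deleted once no file's triplet list refers to it, which is the point the page makes about reusing the existing removal logic.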