summaryrefslogtreecommitdiff
path: root/yarns/0200-repo-formats.yarn
blob: 713cdebfa4898d6364133c81cd1c6d8d0869f6c4 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
Multiple repository format handling
===================================

Obnam supports (or, rather, will in the future; FIXME) several
repository formats. As Obnam development progresses, there is
sometimes a need to change the way data is stored in the backup
repository, for example to allow speed optimisations, or to support
more kinds of file metadata. However, it would be silly to invalidate
all existing backups people have made with Obnam (never mind that
until version 1.0 Obnam did exactly that). Thus, Obnam attempts to
support every format any released version has been able to create
since version 1.0.

The tests in this chapter verify that each such repository format
still works. The source tree contains a directory of archived backup
repositories (in tar archives) and the tests will unpack those, and
verify that they can be restored from correctly. The verification is
done using a `summain` manifest for each generation, stored in the
tar archive with the repository.

Each tar archive will contain a directory `repo`, which is the backup
repository, and `manifest-1` and `manifest-2`, which are the manifests
for the first and second generation.

Repository format 6 (Obnam version 1.0)
---------------------------------------

The repository format 6 is the one used for the 1.0 release of Obnam.
We have two variants of reference repositories: a normal one, and one
using the miserable `--small-files-in-btree` option. It's miserable,
because it complicates the code but doesn't actually make anything
better.

First, the normal one reference repository.

    SCENARIO use repository format 6
    ASSUMING extended attributes are allowed for users
    GIVEN unpacked test data from test-data/repo-format-6-encrypted-deflated.tar.xz in T
    WHEN user havelock restores generation 1 in T/repo to R1
    THEN restored data in R1 matches T/manifest-1
    WHEN user havelock restores generation 2 in T/repo to R2
    THEN restored data in R2 matches T/manifest-2

Then, the in-tree repository.

    SCENARIO use repository format 6 with in-tree data
    ASSUMING extended attributes are allowed for users
    GIVEN unpacked test data from test-data/repo-format-6-in-tree-data.tar.xz in T
    WHEN user havelock restores generation 1 in T/repo to R1
    THEN restored data in R1 matches T/manifest-1
    WHEN user havelock restores generation 2 in T/repo to R2
    THEN restored data in R2 matches T/manifest-2

Implementations
---------------

The following scenario steps are only ever used by scenarios in this
chapter, so we implement them here.

First, we unpack the test data into a known location.

    IMPLEMENTS GIVEN unpacked test data from (\S+) in (\S+)
    mkdir "$DATADIR/$MATCH_2"
    tar -C "$DATADIR/$MATCH_2" -xf "$MATCH_1"

Then we restore the requested generation. Note the use of the
`--always-restore-setuid` option. Without it, the setuid/setgid bits
get restored only if the tests are being run by the `root` user, or a
user with the same uid as recorded in the reference repository. That
would almost always break the test for other people, including CI.

    IMPLEMENTS WHEN user (\S+) restores generation (\d+) in (\S+) to (\S+)
    # Copy the keyrings from source tree so they don't get modified
    # by this test.
    cp -a "$SRCDIR/test-gpghome" "$DATADIR/.gnupg"
    export GNUPGHOME="$DATADIR/.gnupg"
    genid=$(run_obnam "$MATCH_1" -r "$DATADIR/$MATCH_3" \
        --encrypt-with=3B1802F81B321347 genids | sed -n "${MATCH_2}p")
    run_obnam "$MATCH_1" -r "$DATADIR/$MATCH_3" \
        --encrypt-with=3B1802F81B321347 \
        restore --to "$DATADIR/$MATCH_4" --generation "$genid" \
        --always-restore-setuid

Finally, we verify the restored data against the manifest. We have one
tricky bit here: there is no guarantee what the path to the root of
the live data is in the repository, but we search downwards until we
find a directory with more than one child. That's what we match
against the manifest.

    IMPLEMENTS THEN restored data in (\S+) matches (\S+)
    cd "$DATADIR/$MATCH_1"
    while true
    do
        case $(ls | wc -l) in
            1) cd * ;;
            0) echo "No children, oops" 1>&2; exit 1 ;;
            *) break ;;
        esac
    done
    summain -r --exclude=Ino --exclude=Dev --exclude=Uid \
        --exclude=Username --exclude=Gid --exclude=Group \
        --checksum=SHA1 \
        . | normalise_manifest_times > "$DATADIR/restored-manifest"
    normalise_manifest_times "$DATADIR/$MATCH_2" \
        > "$DATADIR/original-manifest"
    diff -u "$DATADIR/original-manifest" "$DATADIR/restored-manifest"