summaryrefslogtreecommitdiff
path: root/yarns/0030-basics.yarn
blob: 226ec2a828be8aab277a18847620484746d8d9c8 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
Basic operation: backup and restore
===================================

This chapter tests the basic operation of Obnam: backing up and
restoring data. Tests in this chapter only concern themselves with a
single generation; see later for tests for multiple generations.

The goal of this chapter is to test Obnam with every kind of data,
every kind of file, and every kind of metadata.

Backup simple data
------------------

This is the simplest of all simple backup tests: generate a small
amount of data in regular files, in a single directory, and backup
that. No symlinks, no empty files, no extended attributes, no nothing.
Just a few files with a bit of data in each. This is what every backup
program must be able to handle.

    SCENARIO backup simple data
    GIVEN 100kB of new data in directory L
    AND a manifest of L in M
    WHEN user U backs up directory L to repository R
    AND user U restores their latest generation in repository R into X
    THEN L, restored to X, matches manifest M
    AND user U can fsck the repository R

Backup sparse files
-------------------

Sparse files present an interesting challenge to backup programs. Most
people have none, but some people have lots, and theirs can have very
large holes. For example, at work I often generate disk images as
raw disk images in sparse files. The image may need to be, say 30
gigabytes in size, even though it only contains one or two gigabyte of
data. The rest is a hole.

A backup program should restore a sparse file as a sparse file.
Otherwise, the 30 gigabyte disk image file will, upon restore, use 30
gigabytes of disk space, rather than one. That might make restoring
impossible.

Unfortunately, it is not easy to (portably) check whether a file is
sparse. We'll settle for making sure the restored file does not use
more disk space than the one in live data.

    SCENARIO backup a sparse file
    GIVEN a file S in L, with a hole, data, a hole
    AND a manifest of L in M
    WHEN user U backs up directory L to repository R
    AND user U restores their latest generation in repository R into X
    THEN L, restored to X, matches manifest M
    AND file S from L, restored in X doesn't use more disk

Backup all interesting file and metadata types
----------------------------------------------

The Unix filesystem abstraction is surprisingly complicated. Indeed,
it can come as a surprise to anyone who's not implemented a backup
program with the intention of being able to restore the live data set
exactly. To complicate things further, different filesystems have
different features, and different Unix-like operating systems don't
all implement all the features, and implement some features
differently.

We need to ensure Obnam can handle anything it encounters, on any
supported platform. That is the purpose of the scenarios in this
section. There are some limitations, though: the test suite is not run
as the `root` user, and thus we don't deal with filesystem objects
that require privileged operations such as device node creation. We
also don't, in these scenarios, handle multiple filesystem types: the
test suite should, instead, be run multiple types, with `TMPDIR` set
to point at a different filesystem type each time: we leave that to
the user running the test suite.

We rely on a helper tool in the Obnam source tree, `mkfunnyfarm`, to
create all the interesting filesystem objects, rather than spelling
them out in the scenarios. This is because that helper tool is used by
other parts of Obnam's test suite as well, and this reduces code
duplication.

    SCENARIO backup non-basic filesystem objects
    ASSUMING extended attributes are allowed for users
    GIVEN directory L with interesting filesystem objects
    AND a manifest of L in M
    WHEN user U backs up directory L to repository R
    AND user U restores their latest generation in repository R into X
    THEN L, restored to X, matches manifest M

As a special case, Obnam needs to notice when only an extended
attribute value changes.

    SCENARIO backup notices when extended attribute value changes
    ASSUMING extended attributes are allowed for users
    GIVEN a file F in L, with data
    AND file L/F has extended attribute user.foo set to foo
    WHEN user U backs up directory L to repository R
    GIVEN file L/F has extended attribute user.foo set to bar
    AND a manifest of L in M
    WHEN user U backs up directory L to repository R
    AND user U restores their latest generation in repository R into X
    THEN L, restored to X, matches manifest M

Backup without changes
----------------------

If we run a backup, then a new one, then the new generation should
match the first one, and no files should be have been backed up in the
second generation.

    SCENARIO no-op backup
    GIVEN a file F in L, with data
    AND a manifest of L in M
    WHEN user U backs up directory L to repository R
    AND user U restores their latest generation in repository R into X
    THEN L, restored to X, matches manifest M

Remove the Obnam log file, so we only have the log from the next
backup run.

    WHEN user U removes file obnam.log
    AND user U backs up directory L to repository R
    THEN obnam.log matches INFO \* files backed up: 0$
    AND L, restored to X, matches manifest M


Backup when a file or directory is unreadable
---------------------------------------------

The backup shouldn't fail even if a file or directory is inaccessible.

    SCENARIO unreadable live data file

We can't run this test as the `root` user, since then everything is
readable.

    ASSUMING not running as root

Create some live data, and a file that is unreadable.

    GIVEN 1k of new data in directory L
    AND file L/unreadable-file with permissions 000
    WHEN user U attempts to back up directory L to repository R
    THEN the error message matches "RCE08AX.*L/unreadable-file"
    WHEN user U attempts to verify L against repository R
    THEN the attempt succeeded

Next, let's do the same thing again, but with an unreadable directory
instead of a file.

    SCENARIO unreadable live data directory
    ASSUMING not running as root
    GIVEN 1k of new data in directory L
    AND directory L/unreadable-dir with permissions 000
    WHEN user U attempts to back up directory L to repository R
    THEN the error message matches "RD5FA4X.*L/unreadable-dir"
    WHEN user U attempts to verify L against repository R
    THEN the attempt succeeded


Backup to roots at once
-----------------------

Often it's useful to backup more than one location at once. We'll
assume that if we can backup two, then it'll all work well.

    SCENARIO backup two roots
    GIVEN 4kB of new data in directory L1
    AND 16kB of new data in directory L2
    AND a manifest of L1 in M1
    AND a manifest of L2 in M2
    WHEN user U backs up directories L1 and L2 to repository R
    AND user U restores their latest generation in repository R into X
    THEN L1, restored to X, matches manifest M1
    THEN L2, restored to X, matches manifest M2

Checkpoint generations
----------------------

Obnam is meant to remove checkpoint generations it created during a
backup, if the backup finishes successfully.

    SCENARIO checkpoint generations are removed
    GIVEN 100kB of new data in directory L
    AND user U sets configuration checkpoint to 1k
    WHEN user U backs up directory L to repository R
    THEN user U sees no checkpoint generations in repository R

Restore a single file
---------------------

We need to be able to restore only a single file. Note that when
restoring a single file, we do not set the parent directory's
modification time according to the backup, so we need to manipulate
the manifest to avoid getting an error.

    SCENARIO restore a single file
    GIVEN a file F in L, with data
    AND a manifest of L/F in M
    WHEN user U backs up directory L to repository R
    AND user U restores file L/F to X from their latest generation in repository R
    THEN L/F, restored to X, matches manifest M

Restores must happen to a non-existent or an empty directory
------------------------------------------------------------

To avoid people doing unfortunate things such as `obnam restore
--to=/` we make sure the target directory of restore either does not
exist, or it's empty.

    SCENARIO restore only to empty or new target
    GIVEN 1kB of new data in directory L
    AND a manifest of L in M
    AND 0kB of new data in directory EMPTY
    AND 2kB of new data in directory NOTEMPTY

    WHEN user U backs up directory L to repository R
    AND user U restores their latest generation in repository R into EMPTY
    THEN L, restored to EMPTY, matches manifest M

    WHEN user U restores their latest generation in repository R into NOTEXIST
    THEN L, restored to NOTEXIST, matches manifest M

    WHEN user U attempts to restore their latest generation 
    ... in repository R into NOTEMPTY
    THEN the attempt failed with exit code 1


Pretend backing up: the `--pretend` setting
-------------------------------------------

The `--pretend` setting lets the user pretend they're doing a backup,
without actually having anything backed up. This is useful for testing
that the configuration is correct: the fake backup runs much faster
than a real one.

    SCENARIO a pretend backup
    GIVEN 10kB of new data in directory L
    WHEN user U backs up directory L to repository R
    GIVEN a manifest of R in M1
    WHEN user U pretends to back up directory L to repository R
    GIVEN a manifest of R in M2
    THEN manifests M1 and M2 match 

Exclude cache directories
-------------------------

The [Cache directory tagging] standard provides an easy way to mark
specific directories as cache directories, which means their data is
easy to re-create (or re-download). Such data is often not worth
backing up. The `--exclude-caches` option tells Obnam to exclude any
directories tagged like that.

[Cache directory tagging]: http://www.bford.info/cachedir/

    SCENARIO exclude cache directories
    GIVEN 1k of new data in directory L/wanted
    AND 1k of new data in directory L/cache
    AND directory L/cache is tagged as a cache directory

We'll now create the manifest, but remove `L/cache` (and files in
`L/cache`) so that it matches what we need. We do it this instead of
creating the manifest before `L/cache`, because creating `L/cache`
changes the timestamp of `L`.

    AND a manifest of L in M
    AND cache is removed from manifest M

Time to backup.

    AND user U sets configuration exclude-caches to yes
    WHEN user U backs up directory L to repository R
    AND user U restores their latest generation in repository R into X
    THEN L, restored to X, matches manifest M

Excluded, already backed up files, are not included in next generation
----------------------------------------------------------------------

Until Obnam version 1.7.4, but fixed after that, Obnam had a bug where
a file that was not excluded became excluded was not removed from new
backup generations. In other words, if file `foo` exists and is backed
up, and the user then makes a new backup with `--exclude=foo`, the new
backup generation still contains `foo`. This is clearly a bug. This
scenario verifies that the bug no longer exists, and prevents it from
recurring.

    SCENARIO new generation drops excluded, previously backed up files
    GIVEN a file foo in L, with data
    WHEN user U backs up directory L to repository R
    GIVEN user U sets configuration exclude to foo
    WHEN user U backs up directory L to repository R
    AND user U restores their latest generation in repository R into X
    THEN L, restored to X, is empty

Changing backup roots
---------------------

When we change the backup roots, i.e., the directories we want backed
up, we do not want the any dropped backup roots to be included in the
new backup.

    SCENARIO replace backup root with new one
    GIVEN 1k of new data in directory L1
    AND 1k of new data in directory L2
    WHEN user U backs up directory L1 to repository R
    AND user U backs up directory L2 to repository R
    AND user U lists latest generation in repository R into F
    THEN nothing in F matches L1

Pre-epoch timestamps
--------------------

It's possible to have timestamps before the epoch, i.e., negative
ones. For example, in the UK during DST, `touch -t 197001010000` will
create one. Test that such timestamps work.

    SCENARIO pre-epoch timestamps
    GIVEN file L/file has Unix timestamp -3600
    AND a manifest of L in M
    WHEN user U backs up directory L to repository R
    AND user U restores their latest generation in repository R into X
    THEN L, restored to X, matches manifest M

Change B-tree node size
-----------------------

The setting for B-tree node size (`--node-size`) only affects new
B-trees. Thus, if we've backed up with one size, and change the
setting to a new size, the backup should still work.

    SCENARIO backup with changed B-tree node size
    GIVEN 100kB of new data in directory L
    AND user U sets configuration node-size to 65536
    WHEN user U backs up directory L to repository R
    GIVEN 100Kb of new data in directory L
    AND a manifest of L in M
    AND user U sets configuration node-size to 4096
    WHEN user U backs up directory L to repository R
    AND user U restores their latest generation in repository R into X
    THEN L, restored to X, matches manifest M
    AND user U can fsck the repository R