From f22994cc99017f5aeab809a406b0c5d1940fd5a8 Mon Sep 17 00:00:00 2001
From: Lars Wirzenius
Date: Thu, 23 Jul 2015 21:53:33 +0300
Subject: Add scenario for repo with lost chunks

---
 yarns/0120-corrupt-repo.yarn | 65 ++++++++++++++++++++++++++++++++++++++++++++
 yarns/9000-implements.yarn   | 37 +++++++++++++++++++++++++
 2 files changed, 102 insertions(+)
 create mode 100644 yarns/0120-corrupt-repo.yarn

diff --git a/yarns/0120-corrupt-repo.yarn b/yarns/0120-corrupt-repo.yarn
new file mode 100644
index 00000000..231738f5
--- /dev/null
+++ b/yarns/0120-corrupt-repo.yarn
@@ -0,0 +1,65 @@
+Robustness: dealing with repository corruption
+==============================================
+
+A repository may be corrupted in various ways, including due to bugs
+in Obnam itself. Obnam needs to be robust against this, and do as well
+as it can, even when the repository isn't quite as good as it might
+be. For example, it should be able to restore data that is still in
+the repository.
+
+The scenario in this chapter handles a specific class of repository
+corruption: file data ("chunks") that have gone missing. As of
+Obnam 1.12, there are known to be bugs that cause that to happen.
+Hopefully, once these scenarios pass, the bugs will either be fixed,
+or at least are handled without crashing by later Obnam operations.
+
+    SCENARIO handle missing file chunks
+
+First, let's create a repository that's OK. We'll make two backup
+generations, with some changes to live data in between.
+
+    GIVEN 10k of data in file L/foo
+    AND a manifest of L in M
+    WHEN user U backs up directory L to repository R
+    GIVEN a manifest of R in MR
+    AND a copy of R in R1
+
+    GIVEN 20k of data in file L/bar
+    AND a copy of L/foo in L/foocopy
+    WHEN user U backs up directory L to repository R
+
+We now have the first generation that has just the file `L/foo`, and
+the second generation that has `L/bar` and `L/foocopy`, and the latter
+is identical to `L/foo`. Because it is identical, it will re-use
+the file chunks of `L/foo`.
+
+If we now remove the chunks that were created by the second backup,
+the first generation is intact, but the second generation's `L/bar`
+file is corrupt (its chunks are missing).
+
+    WHEN repository R resets its chunks to those in R1
+
+We should now be able to restore the first generation without
+problems.
+
+    WHEN user U restores generation 1 to X1 from repository R
+    THEN L, restored to X1, matches manifest M
+
+Restoring the second generation should fail, partially.
+
+    WHEN user U attempts to restore their latest generation
+    ... in repository R into X2
+    THEN the attempt failed with exit code 1
+    AND the error message matches "L/bar: R43272X"
+    AND file L/foo, restored to X2, matches live data
+    AND file L/foocopy, restored to X2, matches live data
+
+We should be able to remove the second generation, despite the missing
+chunk.
+
+    WHEN user U forgets their latest generation in repository R
+    THEN user U sees 1 generation in repository R
+
+    WHEN user U restores their latest generation in repository R
+    ... into X3
+    THEN L, restored to X3, matches manifest M
diff --git a/yarns/9000-implements.yarn b/yarns/9000-implements.yarn
index 56f96912..204611cf 100644
--- a/yarns/9000-implements.yarn
+++ b/yarns/9000-implements.yarn
@@ -114,6 +114,28 @@ Sometimes we need to remove a file.
     IMPLEMENTS WHEN user (\S+) removes file (\S+)
     rm -f "$DATADIR/$MATCH_2"
 
+Copy a file.
+
+    IMPLEMENTS GIVEN a copy of (.+) in (.+)
+    mkdir -p "$DATADIR/$(dirname "$MATCH_2")"
+    cp -a "$DATADIR/$MATCH_1" "$DATADIR/$MATCH_2"
+
+Reset a repository's chunk files.
+
+    IMPLEMENTS WHEN repository (.+) resets its chunks to those in (.+)
+    r1="$DATADIR/$MATCH_1"
+    r2="$DATADIR/$MATCH_2"
+    if [ -e "$r1/chunks" ]
+    then
+        # format 6
+        rm -rf "$r1/chunks"
+        cp -a "$r2/chunks" "$r1/."
+    else
+        rm -rf "$r1/chunk-store"
+        cp -a "$r2/chunk-store" "$r1/."
+    fi
+
+
 Manifest creation and checking
 ------------------------------
 
@@ -327,6 +349,15 @@ Remove the oldest generation.
         head -n1 | grep .)
     run_obnam "$MATCH_1" forget -r "$DATADIR/$MATCH_2" "$id"
 
+Remove the newest generation.
+
+    IMPLEMENTS WHEN user (\S+) forgets their latest generation in repository (\S+)
+    # The grep below at the end of pipeline is there to make sure
+    # the pipeline fails if there were no generations.
+    id=$(run_obnam "$MATCH_1" -r "$DATADIR/$MATCH_2" genids |
+        tail -n1 | grep .)
+    run_obnam "$MATCH_1" forget -r "$DATADIR/$MATCH_2" "$id"
+
 Remove according to a `--keep` schedule.
 
     IMPLEMENTS WHEN user (\S+) forgets according to schedule (\S+) in repository (\S+)
@@ -598,6 +629,12 @@ by the user.
     IMPLEMENTS WHEN user (\S+) reads file (\S+)
     cat "$DATADIR/$MATCH_2"
 
+Does a restored file match what's in live data?
+
+    IMPLEMENTS THEN file (.+), restored to (.+), matches live data
+    cmp "$DATADIR/$MATCH_1" "$DATADIR/$MATCH_2/$DATADIR/$MATCH_1"
+
+
 Check on user running test suite
 --------------------------------
 
-- 
cgit v1.2.1
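
A note on the `tail -n1 | grep .` guard used in the "forgets their latest generation" step: the comment in the patch says the trailing `grep` is there so the pipeline fails when there are no generations. The sketch below isolates that idiom so it can be tried on its own. It is only an illustration: `latest_id` is a hypothetical helper name, not something in Obnam or the yarn suite, and `printf` stands in for the `genids` output.

```shell
#!/bin/sh
# Hypothetical helper (not part of Obnam): print the last line of stdin.
# `grep .` matches any line with at least one character, so the pipeline's
# exit status is non-zero when the last line is empty or there is no input.
latest_id() {
    tail -n1 | grep .
}

# Stand-in for a repository with two generations: prints "5", status 0.
printf '2\n5\n' | latest_id

# Stand-in for a repository with no generations: prints nothing, status 1.
if ! printf '' | latest_id
then
    echo "no generations" >&2
fi
```

Because the exit status of a plain assignment such as `id=$(...)` is that of the command substitution, a failing guard would stop the step before `forget` runs, assuming the step is executed with `set -e` or otherwise checks the status.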