summaryrefslogtreecommitdiff
path: root/manual/en/080-forgetting.mdwn
blob: 1df2ff640a6abb93840c96e6462067107b37f020 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
Forgetting old backup generations
=================================

Every time you make a backup, your backup repository grows in size. To
avoid overflowing all available storage, you need to get rid of some
old backups. That's a bit of a dilemma, of course: you make backups in
order to not lose data and now you have to do exactly that.

Obnam uses the term "forget" about removing a backup generation.
You can specify which generation to remove manually, by generation
identifier, or you can have a schedule to forget them automatically.

To forget a specific generation:

    obnam forget 2

(This example assumes you have a configuration file that Obnam finds
automatically, and that you don't need to specify things like
repository location or encryption on the command line.)

You can forget any individual generation. Thanks to the way Obnam
treats every generation as an independent snapshot (except it's not
really a full backup every time), you don't have to worry about the
distinctions between a full and incremental backup.

Forgetting backups manually is tedious, and you probably want to use a
schedule to have Obnam automatically pick the generations to forget.
A common type of schedule is something like this:

* keep one backup for each day for the past week
* keep one backup for each week for the past three months
* keep one backup for each month for the past two years
* keep one backup for each year for the past fifty seven years

Obnam uses the `--keep` setting to specify a schedule. The setting 
for the above schedule would look like this:

    --keep 7d,15w,24m,57y

This isn't an exact match, due to the unfortunate ambiguity of how
long a month is in weeks, but it's close enough. The setting "7d" is
interpreted as "the last backup of each calendar day for the last
seven days on which backups were made". Similarly for the other parts
of the schedule. See the "Obnam configuration files and settings"
chapter for exact details.

The schedule picks a set of generations to keep. Everything else gets
forgotten.

Choosing a schedule for forgetting generations
----------------------------------------------

The schedule for retiring backup generations is a bit of a guessing
game, just like backups in general. If you could reliably tell the
future, you'd know all the disasters that threaten your data, and you
could backup only the things that would otherwise be lost in the
future.

In this reality, alas, you have to guess. You need to think about what
risks you're facing (or your data is), and how much backup storage
space you're willing to spend on protecting against them.

* Are you afraid of your hard drive suddenly failing in a very
  spectacular way, such as by catching fire or being stolen? If so,
  you really only need one very recent backup to cover against that.
* Do you worry about your hard drive, or filesystem, or your
  application programs, or you yourself, slowly corrupting your data
  over a longer period of time? How long will it take you to find that
  out? You need a backup history that lasts longer than it takes for
  you to detect slow corruption.
* Likewise with accidental deletion of files. How long will it take
  for you to notice? That's how long the backup history should be,
  at minimum.

There's other criteria as well. For example, would you like to see, in
fifty years, how your files are laid out today? If so, you need a
fifty-year-old backup, plus perhaps a backup from each year, if you
want to compare how things were each year in between. With increasing
storage space and nice de-duplication features, this isn't quite as
expensive as it might be.

There is no one schedule that fits everyone's needs and wants. You have
to decide for yourself. That's why the default in Obnam is to keep
everything forever. It's not Obnam's duty to decide that you should
not keep this or that backup generation.