summaryrefslogtreecommitdiff
path: root/manual/fr/020-concepts.mdwn
blob: 2efc8c59194a6c0b3fe50ea71d1e486ab41695c0 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
Vous savez que vous devriez
===========================

Ce chapitre est philosophique et théorique à propos des backups. Il
traite des raisons pour lesquelles vous devriez faire des sauvegardes,
les concepts fondamentaux, ce à quoi vous devez penser lorsque vous
mettez en place vos sauvegardes et que faire au long terme
(vérification, etc). Il mentionne également quelques hypothèses que fait
Obnam et les contraintes que cela vous impose.

Pourquoi sauvegarder ?
----------------------

FIXME : Ajouter quelques histoires horrible ici qui prouvent que les
sauvegardes sont importantes.
Avec rérefences/liens.

Les concepts des sauvegardes
----------------------------

Cette section couvre les concepts principaux des sauvegardes, et
introduit la terminologie utilisée dans ce manuel.

Les **données actives** sont les données sur lesquelles vous travaillez
ou que vous conservez. Ce sont les fichiers sur votre disque dur : les
documents que vous écrivez, les photos que vous enregistrez, les romans
non terminés que vous espérer finir.

La plupart des données actives sont **précieuses**, c'est à dire que vous seriez contrarié si elles disparaissaient. Certains fichiers actifs ne sont pas précieux : le cache de votre navigateur web, par exemple. Cette distinction peut vous permettre de limiter la quantité de données que vous devez sauvegarder, ce qui diminue de manière significative le coût de vos sauvegardes.

Une **sauvegarde** est une copie supplémentaire de vos données actives.
Si vous perdez une partie ou la totalité de vos fichiers actifs, vous
pouvez les récupérer (**restaurer**) depuis vos sauvegardes. Cette copie
de sauvegarde est, en pratique, plus vieille que vos données actives,
mais si vous sauvegardez régulièrement, vous ne devriez jamais perdre
grand chose.

Des fois il est utile d'avoir plus qu'une seule copie de sauvegarde de
vos données actives. Vous pouvez avoir une série de sauvegardes, faites
à des moments différents, vous offrant un **historique de sauvegardes**.
Chaque copie de vos données dans votre historique est une **génération**.
Ceci vous permet de récupérer un fichier que vous avez supprimé il y a
un certain temps, mais sans que vous le remarquiez jusqu'à présent. Si
vous n'avez qu'une copie de sauvegarde, vous ne pouvez plus le
récupérer, mais si vous avez par exemple une sauvegarde par jour sur un
mois passé, vous avez un mois pour réaliser qu'il vous manque un fichier
avec qu'il soit perdu pour toujours.

L'emplacement où vos sauvegardes sont stockées est le **dépôt de
sauvegarde** (ndt : "backup repository"). Vous pouvez utiliser de
nombreux types de **media de sauvegarde** pour le stocker : disque dur,
disque option (DVD-R, CD-R, etc), clé USB, espace en ligne, etc. Chaque
type de medium a différentes caractéristiques : taille, vitesse,
commodité, fiabilité, prix, que vous devez prendre en compte pour
choisir la solution la plus raisonnable pour vous.

Il est possible que vous ayez besoin de plusieurs dépôts ou media, l'un
d'entre eux situé **hors-site**, loin de votre machines habituelles.
Sinon, si votre maison brûle, vous perdez vos sauvegardes avec.

Vous devez **vérifier** que vos sauvegardes marchent. Ce serait dommage
de faire l'effort de sauvegarder régulièrement et ensuite de ne pas
pouvoir les restaurer lorsque vous en avez besoin. Vous devriez sans
doute tester votre **plan après sinistre** en imaginant que vos fichiers
ont disparu, à l'exception de vos media de sauvegarde. Parvenez-vous à
récupérer ? Vous devriez sans doute faire ceci régulièrement, pour être
certain que votre solution de sauvegarde marche toujours.

Il y a un grande variété d'**outils de sauvegarde**. Ils peuvent être
simples et manuels : copiez vos fichier sur une clé USB en utilisant
votre explorateur de fichiers, tous les trente-six du mois. Ils peuvent
aussi être très complexes : les produits de sauvegarde conçus pour les
entreprises qui coûte énormément d'argent et incluent plusieurs jours de
de formation pour votre IT, et qui nécessite une attention constante de
la part de cette équipe.

Vous devez définir une **stratégie de sauvegarde** pour tout relier
ensemble : quelles données active sauvegarder, sur quel medium, en
utilisant quel outils, quel historique conserver et comment vérifier que
tout marche.

Backup strategies
-----------------

You've set up a backup repository, and you have been backing up to
it every day for a month now: your backup history is getting long
enough to be useful. Can you be happy now?

Welcome to the world of threat modelling. Backups are about
insurance, of mitigating small and large disasters, but disasters
can strike backups as well. When are you so safe that no disaster
will harm you?

There is always a bigger disaster waiting to happen. If you backup
to a USB drive on your work desk, and someone breaks in and steals
both your computer and the USB drive, the backups did you no good.

You fix that by having two USB drives, and you keep one with your
computer and the other in a bank vault. That's pretty safe, unless
there's an earth quake that destroys both your home and the bank.

You fix that by renting online storage space from another country.
That's quite good, except there's a bug in the operating system
that you use, which happens to be the same operating system the
storage provider uses, and hackers happen to break into both your
and their systems, wiping all files.

You fix that by hiring a 3D printer that prints slabs of concrete on
which your data is encoded using QR codes. You're safe until there's a
meteorite hits Earth and destroys the entire civilisation.

You fix that by sending out satellites with copies of your data,
into stable orbits around all nine planets (Pluto is too a planet!)
in the solar system. Your data is safe, even though you yourself
are dead from the meteorite, until the Sun goes supernova and
destroys everything in the system.

There is always a bigger disaster. You have to decide which
ones are likely enough that you want to consider them, and also
decide what the acceptable costs are for protecting against them.

A short list of scenarios for thinking about threats:

* What if you lose your computer?
* What if you lose your home and all of its contents?
* What if the area in which you live is destroyed?
* What if you have to flee your country?

These questions do not cover everything, but they're a start. For each
one, think about:

* Can you live with your loss of data? If you don't restore your
  data, does it cause a loss of memories, or some inconvenience in your
  daily life, or will it make it nearly impossible to go back to living
  and working normally? What data do you care most about?

* How much is it worth to you to get your data back, and how fast do
  you want that to happen? How much are you willing to invest money
  and effort to do the initial backup, and to continue backing up
  over time? And for restores, how much are you willing to pay for
  that? Is it better for you to spend less on backups, even if that
  makes restores slower, more expensive, and more effort? Or is the
  inverse true?

The threat modelling here is about safety against accidents and
natural disasters. Threat modelling against attacks and enemies
is similar, but also different, and will be the topic of the
next episode in the adventures of Bac-Kup.

Backups and security
--------------------

You're not the only one who cares about your data. A variety of
governments, corporations, criminals, and overly curious snoopers are
probably also interested. (It's sometimes hard to tell them apart.)
They might be interested to find evidence against you, blackmail you,
or just curious about what you're talking about with your other
friends.

They might be interested in your data from a statistical point of view,
and don't particularly care about you specifically.  Or they might be
interested only in you.

Instead of reading your files and e-mail, or looking at your photos and
videos, they might be interested in preventing your access to them,
or to destroy your data. They might even want to corrupt your data,
perhaps by planting child porn in your photo archive.

You protect your computer as well as you can to prevent these and other
bad things from happening. You need to protect your backups with equal
care.

If you back up to a USB drive, you should probably make the drive be
encrypted. Likewise, if you back up to online storage.  There are many
forms of encryption, and I'm unqualified to give advice on this, but any
of the common, modern ones should suffice except for quite determined
attackers.

Instead of, or in addition to, encryption, you could ensure the physical
security of your backup storage. Keep the USB drive in a safe, perhaps,
or a safe deposit box.

The multiple backups you need to protect yourself against earthquakes,
floods, and roving gangs of tricycle-riding clowns, are also useful
against attackers. They might corrupt your live data, and the backups at
your home, but probably won't be able to touch the USB drive encased in
concrete and buried in the ground at a secret place only you know about.

The other side of the coin is that you might want to, or need to, ensure
others do have access to your backed up data. For example, if the clown
gang kidnaps you, your spouse might need access to your backups to be
able to contact your MI6 handler to ask them to rescue you. Arranging
safe access to (some) backups is an interesting problem to which there
are various solutions. You could give your spouse the encryption passphrase,
or give the passphrase to a trusted friend or your lawyer. You could also
use something like [libgfshare] to escrow encryption keys more safely.

[libgfshare]: http://www.digital-scurf.org/software/libgfshare

Backup storage media considerations
-----------------------------------

This section discusses possibilities for backup storage media, and
their various characteristics, and how to choose the suitable one
for oneself.

There are a lot of different possible storage media. Perhaps the most
important ones are:

* Magnetic tapes of various kinds.
* Hard drives: internal vs external, spinning magnetic surfaces vs
  SSDs vs memory sticks.
* Optical disks: CD, DVD, Blu-ray.
* Online storage of various kinds.
* Paper.

We'll skip more exotic or unusual forms, such as microfilm.

**Magnetic tapes** are traditionally probably the most common form of
backup storage. They can be cheap per gigabyte, but tend to require a
fairly hefty initial investment in the tape drive. Much backup
terminology comes from tape drives: full backup vs incremental backup,
especially. Obnam doesn't support tape drives at all.

**Hard drives** are a common modern alternative to tapes, especially for
those who do not wish pay for a tape drive. Hard drives have the
benefit of every bit of backup being accessible at the same speed as any
other bit, making finding a particular old file easier and faster.
This also enables **snapshot backups**, which is the model Obnam uses.

Different types of hard drives have different characteristics for
reliability, speed, and price, and they may fluctuate fairly quickly
from week to week and year to year. We won't go into detailed
comparisons of all the options. From Obnam's point of view, anything
that can look like a hard drive (spinning rust, SSD, USB flash memory
stick, or online storage) is usable for storing backups, as long as
it is re-writable.

**Optical disks**, particularly the kind that are write-once and can't
be updated, can be used for backup storage, but they tend to be best
for full backups that are stored for long periods of time, perhaps
archived permanently, rather than for a actively used backup
repository. Alternatively, they can be used as a kind of tape backup,
where each tape is only ever used once. Obnam does not support optical
drives as backup storage.

**Paper** likewise works better for archival purposes, and only for
fairly small amounts of data. However, a backup printed on good paper
with archival ink can last decades, even centuries, and is a good
option for small, but very precious data. As an example, personal
financial records, secret encryption keys, and love letters from your
spouse. These can be printed either normally (preferably in a font
that is easy to OCR), or using two-dimensional barcode (e.g, QR).
Obnam doesn't support these, either.

Obnam only works with hard drives, and anything that can simulate a
read/writable hard drive, such as online storage. By amazing
co-incidence, this seems to be sufficient for most people.

Glossary
--------

* **backup**: a separate, safe copy of your live data that will remain
  intact even if the primary copy gets destroyed, deleted, or wrongly
  modified
* **corruption**: unwanted modification to (backup) data
* **disaster recovery**: what you do when something goes wrong
* **full backup**: a fresh backup of all precious live data
* **generation**: a backup in a series of backups of the same live
  data, to give historical insight
* **history**: all the backup generations
* **incremental backup**: a backup of any changes (new files, modified
  files, deletions) compared to a previous backup generation (either
  the previous full backup, or the previous incremental backup);  usually, you can't remove a full backup without removing all of the
  incremental backups that depend on it
* **live data**: all the data you have
* **local backup**: a backup repository stored physically close to the
  live data
* **media**, **backup media**, **storage media**: where a backup
  repository is stored
* **off-site backup**: a backup repository stored physically far away
  from the live data
* **precious data**: all the data you care about; cf. live data
* **repository**: the location where backup data is stored
* **restore**: retrieving data from a backup repository
* **root**, **backup root**: a directory that is to be backed up,
  including all files in it, and all its subdirectories
* **snapshot backup**: an alternative to full/incremental backups,
  where every backup generation is effectively a full backup of all
  the precious live data, and can be restored and removed as easily as
  any other generation
* **strategy**, **backup strategy**: a plan for how to make sure your
  data is safe even if the dinosaurs return in space ships to re-take
  world now that the ice age is over
* **verification**: making sure a backup system works and that data
  actually can be restored from backups and that the backups have not
  become corrupted