From 2dee685a1f8fb954fbeb9fd9a9d0dbb57b34b8ee Mon Sep 17 00:00:00 2001
From: Lars Wirzenius
Date: Sat, 29 Mar 2014 11:43:45 +0000
Subject: Move English manual texts to en subdir

---
 manual/en/020-concepts.mdwn | 293 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 293 insertions(+)
 create mode 100644 manual/en/020-concepts.mdwn

(limited to 'manual/en/020-concepts.mdwn')

diff --git a/manual/en/020-concepts.mdwn b/manual/en/020-concepts.mdwn
new file mode 100644
index 00000000..bbd22878
--- /dev/null
+++ b/manual/en/020-concepts.mdwn
@@ -0,0 +1,293 @@
+You know you should
+===================
+
+This chapter covers the philosophy and theory of backups.
+It discusses why you should back up, various concepts around backups,
+what kinds of things you should think about when setting up backups
+and what to do in the long term (verification, etc.). It also discusses
+some assumptions Obnam makes and some constraints it imposes.
+
+Why back up?
+------------
+
+FIXME: Add some horror stories here about why backups are important.
+With references/links.
+
+Backup concepts
+---------------
+
+This section covers core concepts in backups, and defines some
+terminology used in this book.
+
+**Live data** is the data you work with or keep. It's the files on
+your hard drive: the documents you write, the photos you save, the
+unfinished novels you wish you'd finish.
+
+Most live data is **precious** in that you'll be upset if you lose it.
+Some live data is not precious: your web browser cache probably isn't,
+for example. This distinction can let you limit the amount of data
+you need to back up, which can significantly reduce your backup costs.
+
+A **backup** is a spare copy of your live data. If you lose some
+or all of your live data, you can get it back ("**restore**") from
+your backup. The backup copy is, by practical necessity, older than
+your live data, but if you made the backup recently enough, you won't
+lose much.
+
+Sometimes it's useful to have more than one old backup copy of your
+live data. You can have a sequence of backups, made at different
+times, giving you a **backup history**. Each copy of your live data
+in your backup history is a **generation**. This lets you retrieve a
+file you deleted a long time ago, but didn't realise you needed until
+now. If you only keep one backup version, you can't get it back,
+but if you keep, say, a daily backup for a month, you have a month
+to realise you need it, before it's lost forever.
+
+The place your backups are stored is the **backup repository**. You can
+use many kinds of **backup media** for backup storage: hard drives,
+tapes, optical disks (DVD-R, DVD-RW, etc.), USB flash drives, online
+storage, etc. Each type of medium has different characteristics:
+size, speed, convenience, reliability, and price, which you'll need to
+balance for a backup solution that's reasonable for you.
+
+You may need multiple backup repositories or media, with one of
+them located **off-site**, away from where your computers normally
+live. Otherwise, if your house burns down, you'll lose all your
+backups too.
+
+You need to **verify** that your backups work. It would be awkward to
+go to the effort and expense of making backups and then not be able
+to restore your data when you need to. You may even want to test
+your **disaster recovery** by pretending that all your computer stuff
+is gone, except for the backup media. Can you still recover? You'll
+want to do this periodically, to make sure your backup system keeps
+working.
+
+There is a very large variety of **backup tools**. They can be
+very simple and manual: you can copy files to a USB drive using
+your file manager, once in a blue moon. They can also be very complex:
+enterprise backup products that cost huge amounts of money and come
+with a multi-day training package for your sysadmin team, and which
+require that team to function properly.
+
+You'll need to define a **backup strategy** to tie everything
+together: what live data to back up, to what medium, using what
+tools, what kind of backup history to keep, and how to verify
+that the backups work.
+
+Backup strategies
+-----------------
+
+You've set up a backup repository, and you have been backing up to
+it every day for a month now: your backup history is getting long
+enough to be useful. Can you be happy now?
+
+Welcome to the world of threat modelling. Backups are a form of
+insurance, a way of mitigating small and large disasters, but disasters
+can strike backups as well. When are you so safe that no disaster
+will harm you?
+
+There is always a bigger disaster waiting to happen. If you back up
+to a USB drive on your work desk, and someone breaks in and steals
+both your computer and the USB drive, the backups did you no good.
+
+You fix that by having two USB drives, and you keep one with your
+computer and the other in a bank vault. That's pretty safe, unless
+there's an earthquake that destroys both your home and the bank.
+
+You fix that by renting online storage space from another country.
+That's quite good, except there's a bug in the operating system
+that you use, which happens to be the same operating system the
+storage provider uses, and hackers happen to break into both your
+system and theirs, wiping all files.
+
+You fix that by hiring a 3D printer that prints slabs of concrete on
+which your data is encoded using QR codes. You're safe until a
+meteorite hits Earth and destroys the entire civilisation.
+
+You fix that by sending out satellites with copies of your data,
+into stable orbits around all nine planets (Pluto is too a planet!)
+in the solar system. Your data is safe, even though you yourself
+are dead from the meteorite, until the Sun goes supernova and
+destroys everything in the system.
+
+There is always a bigger disaster. You have to decide which
+ones are likely enough that you want to consider them, and also
+decide what the acceptable costs are for protecting against them.
+
+A short list of scenarios for thinking about threats:
+
+* What if you lose your computer?
+* What if you lose your home and all of its contents?
+* What if the area in which you live is destroyed?
+* What if you have to flee your country?
+
+These questions do not cover everything, but they're a start. For each
+one, think about:
+
+* Can you live with your loss of data? If you don't restore your
+  data, does it cause a loss of memories, or some inconvenience in your
+  daily life, or will it make it nearly impossible to go back to living
+  and working normally? What data do you care most about?
+
+* How much is it worth to you to get your data back, and how fast do
+  you want that to happen? How much money and effort are you willing
+  to invest in the initial backup, and in continuing to back up
+  over time? And how much are you willing to pay for restores?
+  Is it better for you to spend less on backups, even if that
+  makes restores slower, more expensive, and more effort? Or is the
+  inverse true?
+
+The threat modelling here is about safety against accidents and
+natural disasters. Threat modelling against attacks and enemies
+is similar, but also different, and will be the topic of the
+next episode in the adventures of Bac-Kup.
+
+Backups and security
+--------------------
+
+You're not the only one who cares about your data. A variety of
+governments, corporations, criminals, and overly curious snoopers are
+probably also interested. (It's sometimes hard to tell them apart.)
+They might be interested in finding evidence against you, blackmailing
+you, or just curious about what you're talking about with your other
+friends.
+
+They might be interested in your data from a statistical point of view,
+and not particularly care about you specifically. Or they might be
+interested only in you.
+
+Instead of reading your files and e-mail, or looking at your photos and
+videos, they might be interested in preventing your access to them,
+or in destroying your data. They might even want to corrupt your data,
+perhaps by planting child porn in your photo archive.
+
+You protect your computer as well as you can to prevent these and other
+bad things from happening. You need to protect your backups with equal
+care.
+
+If you back up to a USB drive, you should probably encrypt the
+drive. Likewise if you back up to online storage. There are many
+forms of encryption, and I'm unqualified to give advice on this, but any
+of the common, modern ones should suffice except against quite determined
+attackers.
+
+Instead of, or in addition to, encryption, you could ensure the physical
+security of your backup storage. Keep the USB drive in a safe, perhaps,
+or a safe deposit box.
+
+The multiple backups you need to protect yourself against earthquakes,
+floods, and roving gangs of tricycle-riding clowns, are also useful
+against attackers. They might corrupt your live data, and the backups at
+your home, but probably won't be able to touch the USB drive encased in
+concrete and buried in the ground at a secret place only you know about.
+
+The other side of the coin is that you might want to, or need to, ensure
+others do have access to your backed up data. For example, if the clown
+gang kidnaps you, your spouse might need access to your backups to be
+able to contact your MI6 handler to ask them to rescue you. Arranging
+safe access to (some) backups is an interesting problem to which there
+are various solutions. You could give your spouse the encryption passphrase,
+or give the passphrase to a trusted friend or your lawyer. You could also
+use something like [libgfshare] to escrow encryption keys more safely.
+
+[libgfshare]: http://www.digital-scurf.org/software/libgfshare
+
+Backup storage media considerations
+-----------------------------------
+
+This section discusses possibilities for backup storage media,
+their various characteristics, and how to choose one that suits
+you.
+
+There are a lot of different possible storage media. Perhaps the most
+important ones are:
+
+* Magnetic tapes of various kinds.
+* Hard drives: internal vs external, spinning magnetic surfaces vs
+  SSDs vs memory sticks.
+* Optical disks: CD, DVD, Blu-ray.
+* Online storage of various kinds.
+* Paper.
+
+We'll skip more exotic or unusual forms, such as microfilm.
+
+**Magnetic tapes** are traditionally probably the most common form of
+backup storage. They can be cheap per gigabyte, but tend to require a
+fairly hefty initial investment in the tape drive. Much backup
+terminology comes from tape drives: full backup vs incremental backup,
+especially. Obnam doesn't support tape drives at all.
+
+**Hard drives** are a common modern alternative to tapes, especially for
+those who do not wish to pay for a tape drive. Hard drives have the
+benefit of every bit of backup being accessible at the same speed as any
+other bit, making finding a particular old file easier and faster.
+This also enables **snapshot backups**, which is the model Obnam uses.
+
+Different types of hard drives have different characteristics for
+reliability, speed, and price, and they may fluctuate fairly quickly
+from week to week and year to year. We won't go into detailed
+comparisons of all the options. From Obnam's point of view, anything
+that can look like a hard drive (spinning rust, SSD, USB flash memory
+stick, or online storage) is useable for storing backups, as long as
+it is re-writeable.
+
+**Optical disks**, particularly the kind that are write-once and can't
+be updated, can be used for backup storage, but they tend to be best
+for full backups that are stored for long periods of time, perhaps
+archived permanently, rather than for an actively used backup
+repository. Alternatively, they can be used as a kind of tape backup,
+where each tape is only ever used once. Obnam does not support optical
+drives as backup storage.
+
+**Paper** likewise works better for archival purposes, and only for
+fairly small amounts of data. However, a backup printed on good paper
+with archival ink can last decades, even centuries, and is a good
+option for small, but very precious data. Examples include personal
+financial records, secret encryption keys, and love letters from your
+spouse. These can be printed either normally (preferably in a font
+that is easy to OCR), or using a two-dimensional barcode (e.g., QR).
+Obnam doesn't support these, either.
+
+Obnam only works with hard drives, and anything that can simulate a
+read/writeable hard drive, such as online storage. By amazing
+coincidence, this seems to be sufficient for most people.
+
+Glossary
+--------
+
+* **backup**: a separate, safe copy of your live data that will remain
+  intact even if the primary copy gets destroyed, deleted, or wrongly
+  modified
+* **corruption**: unwanted modification to (backup) data
+* **disaster recovery**: what you do when something goes wrong
+* **full backup**: a fresh backup of all precious live data
+* **generation**: a backup in a series of backups of the same live
+  data, to give historical insight
+* **history**: all the backup generations
+* **incremental backup**: a backup of any changes (new files, modified
+  files, deletions) compared to a previous backup generation (either
+  the previous full backup, or the previous incremental backup); usually, you can't remove a full backup without removing all of the
+  incremental backups that depend on it
+* **live data**: all the data you have
+* **local backup**: a backup repository stored physically close to the
+  live data
+* **media**, **backup media**, **storage media**: where a backup
+  repository is stored
+* **off-site backup**: a backup repository stored physically far away
+  from the live data
+* **precious data**: all the data you care about; cf. live data
+* **repository**: the location where backups are stored
+* **restore**: retrieving data from a backup repository
+* **root**, **backup root**: a directory that is to be backed up,
+  including all files in it, and all its subdirectories
+* **snapshot backup**: an alternative to full/incremental backups,
+  where every backup generation is effectively a full backup of all
+  the precious live data, and can be restored and removed as easily as
+  any other generation
+* **strategy**, **backup strategy**: a plan for how to make sure your
+  data is safe even if the dinosaurs return in space ships to retake
+  the world now that the ice age is over
+* **verification**: making sure a backup system works and that data
+  can actually be restored from backups and that the backups have not
+  become corrupted
-- 
cgit v1.2.1