As I wrote in Chapter 0, my personal opinion is that we do backups because we want (and have) to be able to do restores.
In other words: define which data you need to restore, and within what amount of time. Once you have done this in detail, it’s quite easy to define how your backups have to be done in order to fulfill these restore requests.
I ask nearly the same question of customers who invite me to help them create a backup design: “Please tell me how you think a restore should work for you, which data you need to retrieve, and how much time you’re willing and able to allow for this process. Please try not to use any technological explanations here; just tell me in your own words.”
After all the different opinions from all attendees are noted on a whiteboard, I usually start explaining the different technologies available on the market to ensure that everybody has the same level of knowledge. I also tell people about the advantages and disadvantages of these technologies. This way, it’s quite easy to get a first impression of which direction the workshop will take.
There are quite a few points to talk about before I can start to design a reliable backup (and restore) concept that meets (or exceeds) the company’s needs but is still manageable.
Some of these points are:
- The type of data to protect
- The amount of data to protect
- The percentage of data that changes frequently
- The existing infrastructure (network, storage, etc.)
- The timeframe available for restore
- The amount of time the admin team can invest in backup
- Legal or compliance requirements
Sometimes, after writing all these requirements on a whiteboard so everyone in the room can read them, we find out that they are mutually exclusive.
As a (grossly overstated) example: if someone tells me they need to protect five terabytes of data per day and the infrastructure team says that the server network is based on 100 Mbit switches, I can only tell the customer that this will never work. Even at full line rate, such a link moves only about one terabyte in 24 hours.
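The arithmetic behind this kind of sanity check can be sketched in a few lines of Python. This is only a rough illustration: the function names and the `efficiency` parameter are my own assumptions, it uses decimal units (1 TB = 10^12 bytes), and it ignores protocol overhead, compression and deduplication unless you lower `efficiency` to account for them.

```python
def transfer_hours(data_tb: float, link_mbit_per_s: float, efficiency: float = 1.0) -> float:
    """Hours needed to move data_tb terabytes over a link.

    efficiency is a hypothetical knob (0 < efficiency <= 1) for the share
    of the nominal line rate that is really usable; 1.0 is the best case.
    """
    if data_tb < 0 or link_mbit_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid input")
    bits_to_move = data_tb * 1e12 * 8                  # terabytes -> bits
    usable_bits_per_s = link_mbit_per_s * 1e6 * efficiency
    return bits_to_move / usable_bits_per_s / 3600     # seconds -> hours


def max_tb_per_day(link_mbit_per_s: float, efficiency: float = 1.0) -> float:
    """Upper bound on terabytes a link can move in 24 hours."""
    bytes_per_day = link_mbit_per_s * 1e6 * efficiency * 86400 / 8
    return bytes_per_day / 1e12


if __name__ == "__main__":
    print(f"5 TB over 100 Mbit/s: {transfer_hours(5, 100):.1f} hours")
    print(f"Best case per day:    {max_tb_per_day(100):.2f} TB")
```

Under these assumptions, five terabytes would need more than four and a half days of uninterrupted transfer, which is why the daily requirement and the existing network cannot both stand.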
The next point I try to clarify is what type of risk the company is trying to protect against. Just to give you an example, these are the risks I hear named most often:
- Accidentally (or willfully) deleted data
- Technical outages of all kinds
- Virus infections
- Natural disasters
These risks have to be sorted and weighted, because some of them are more likely to happen than others. The impact of each of them can also differ according to the technology in place at the customer.
And, finally, there is the budget to acquire, implement and test the solution itself. The testing part in particular is something that gets underestimated very often. But tell me, how much can you rely on a technology that you’ve never tested and that you’ve never seen working? Compare it to a fire alarm: if you have never had a fire drill, the risk is very high that in an emergency no one knows what to do.
Of course, at this point of the workshop, nobody is able to size a budget. But everyone in the room needs to start accepting that backup is something you don’t get for next to nothing.
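One simple way to sort and weight such a list is a likelihood-times-impact score. The sketch below is only an illustration of that common technique, not a description of how any particular workshop does it: the `Risk` class, the 1 to 5 scales and all the numbers are hypothetical placeholders that each company would have to replace with its own estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: int  # hypothetical scale: 1 (rare) to 5 (frequent)
    impact: int      # hypothetical scale: 1 (minor) to 5 (severe)

    @property
    def score(self) -> int:
        # Simple product; other weightings are equally possible.
        return self.likelihood * self.impact


def rank_risks(risks: list[Risk]) -> list[Risk]:
    """Return the risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Example values are made up purely for illustration.
risks = [
    Risk("Accidentally or willfully deleted data", likelihood=5, impact=2),
    Risk("Technical outages", likelihood=3, impact=4),
    Risk("Virus infections", likelihood=3, impact=5),
    Risk("Natural disasters", likelihood=1, impact=5),
]

if __name__ == "__main__":
    for risk in rank_risks(risks):
        print(f"{risk.score:>2}  {risk.name}")
```

The value of such a table lies less in the exact numbers than in forcing everyone in the room to agree on an order of priorities.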
Continue reading here: Chapter 2 – Backup Methods