Relax, this is only a test. That’s what they say every year. But you know different. Just because it is a simulated disaster doesn’t mean you won’t be stressed and tired.
Disaster recovery tests are something a lot of medium to large businesses conduct to prove that they can restore their business-critical information services, applications, databases, etc. in the event that one or more of their data centers is partially or wholly taken offline by some unplanned event.
A recovery test typically involves shipping backup tapes to another site, normally a considerable distance away from the production data center, where blank servers, network equipment, disk storage hardware, and various other components await. Those systems are loaded with the production backups and then brought online so users can test and verify that the company’s applications and datasets are all available and functioning properly.
I’ve made it sound simple, but consider the complexities of numerous different operating systems (Windows, Unix, iSeries, etc.) and the various components residing on different hosts (database over here, application executables over there, web front ends out that-a-way). You begin to see the massive amount of communication and coordination needed to restore and rebuild systems that were originally installed and configured over weeks or even months.
When someone tells you that you have to get a few hundred of those components recovered and working at a site several hundred miles away, and you have less than 72 hours to do it, you develop a keen sense of urgency.
This is my fourteenth annual DR test in a row, and I’ve been watching a small army of system admins and engineers do this stuff. At one time or another I’ve been involved in practically every discipline of it, from building desktop systems for test users to configuring network devices to restoring NetWare, Windows, Linux, and Unix systems.
The first few years I even had to configure stacked Token Ring switches and FDDI backbones, two topologies that are rusting away in the distant past of networking technology.
These days I just yak on the conference bridge, twiddle my thumbs as the tapes spin billions of ones and zeros out onto disk platters, and bellyache when some poor technical staff member who has been up for thirty hours straight doesn’t immediately respond when we call him or her. Disaster recovery exercises are not quite as tough on management’s sleep patterns as they are on the technicians’, but they are still a huge dose of stress. Failure is a big sign that you can’t be trusted to get the company back into money-making mode fast enough if the corporate headquarters gets hit by a tornado or shut down because someone discovered an Orc infestation in the basement.
Yeah, that’s the way I’m spending this weekend.
I’d much rather be washing dishes or mowing the lawn or clearing a clog out of a sewer line.