10 steps for optimal IT disaster recovery plan design
Is your disaster recovery plan comprehensive and easy for everyone involved to understand? Our guidelines will help set you on the right path.
There are many items -- some location-specific and others more generic -- that need to be included in an IT disaster recovery plan. This article discusses the top 10 items critical to the success of your plan. Organizations may disagree about the priority of these items, but they must exist somewhere within the DR plan.
1. The IT disaster recovery plan must have an accurate communication list
Communicating with other employees is essential during a disaster. Some organizations use third-party vendor products, such as Everbridge, which can send out messages over email, text and social media, in addition to phone calls. Regardless of how you generate your call list, it must be current. The list should have designated backups for each key individual and multiple pieces of contact information for them. If you use a call tree, make sure it loops back so that the last person on the list confirms that the call was made. Also, someone should be designated as the communication list manager to monitor responses and contact backup staff as necessary.
The HR department is a good source for updated contact information. It's important to frequently check to see if there have been any changes in personnel. The emergency management team is another good resource in developing the contact list. Once your organization has a solid list, do a run-through so each person knows what to do when needed. You should also do a test if you're using an automated notification system.
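As a rough illustration of the call list described above, the following minimal Python sketch models contacts with multiple numbers and a designated backup, then reports who was reached. It is not tied to any notification product. The names `Contact` and `run_call_list`, the phone numbers and the `answered` set are all hypothetical stand-ins for whatever your run-through or notification system records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contact:
    name: str
    numbers: list[str]                  # multiple ways to reach the person
    backup: Optional["Contact"] = None  # designated backup, if any


def run_call_list(contacts: list[Contact], answered: set[str]) -> dict[str, list[str]]:
    """Summarize a run-through for the communication list manager.

    `answered` is a hypothetical record of who responded during the test.
    """
    report: dict[str, list[str]] = {"reached": [], "backup_reached": [], "unreached": []}
    for person in contacts:
        if person.name in answered:
            report["reached"].append(person.name)
        elif person.backup is not None and person.backup.name in answered:
            # Primary did not respond, so the designated backup stands in.
            report["backup_reached"].append(person.backup.name)
        else:
            # Neither primary nor backup responded; needs manual follow-up.
            report["unreached"].append(person.name)
    return report


call_list = [
    Contact("Network lead", ["555-0100", "555-0101"],
            backup=Contact("Network backup", ["555-0102"])),
    Contact("Storage lead", ["555-0110"],
            backup=Contact("Storage backup", ["555-0111"])),
    Contact("Facilities", ["555-0120"]),  # no backup listed: a gap to fix
]

report = run_call_list(call_list, answered={"Network lead", "Storage backup"})
print(report)
```

A run-through like this makes gaps visible before a real event. In this example, the "Facilities" entry has no backup and ends up in the unreached list.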
2. Work off a detailed script during a disaster
When you are in recovery mode, many things occur at the same time, and this often leads to confusion. To make the DR process easier, include a detailed script or step-by-step instructions in your DR plan. Several members of the DR team should formally review the script. Because there is no guarantee the script will be followed by the same person who wrote it, it's best to use a simple bulleted list with easy-to-follow steps.
In a real recovery situation, the pressure will be intense. A disaster can also occur at any time, so if the plan is executed late at night, confusion is likely to be high. Anything the plan can do to make the steps easier to follow will help.
If possible, try to anticipate errors and include remediation steps. For example, a bulleted point can be as follows: "Connect network cable to PC. If network not found, first check if PC Ethernet connection shows signal." Straightforward and simple instructions go a long way when you are under extreme stress and fatigue. Avoid using terms that may not be understood when extreme fatigue hits. If you must use technical terms that may not be understood, add a glossary.
Be clear about employee roles and responsibilities. Perform a test of the script, and make changes as necessary. It's critical the script is accurate and up to date.
It may also be helpful to follow business continuity/disaster recovery standards, such as NIST SP 800-34, in developing your script.
3. Test and retest the IT disaster recovery plan
It is possible to test separate portions of the DR plan on their own, but the entire plan should be tested at least once a year or when a major change takes place.
If you exercise or test the DR plan at least once a quarter, then the staff will become more familiar with it. And to make it more effective, try to exercise the plan with different staff members when possible. When testing, don't assume everything will go according to plan. That is why it's important to test for unusual conditions, such as when a drive or a technical component fails.
Testing is a good way to ensure the IT disaster recovery plan remains updated. IT systems and personnel change often. In addition, an organization's recovery time objective and recovery point objective requirements fluctuate. A DR test gives you a solid indication of whether you can meet those important benchmarks.
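One way to turn a test into a clear pass or fail against those benchmarks is to record three timestamps and compare them with the objectives. The sketch below assumes the usual definitions: the recovery time objective (RTO) is the longest acceptable time to restore service, and the recovery point objective (RPO) is the longest acceptable window of lost data. The function name `meets_objectives` and all of the times and targets are illustrative, not taken from any particular plan.

```python
from datetime import datetime, timedelta


def meets_objectives(outage_start: datetime,
                     service_restored: datetime,
                     last_good_backup: datetime,
                     rto: timedelta,
                     rpo: timedelta) -> dict:
    """Compare what a DR test measured against the RTO and RPO targets."""
    recovery_time = service_restored - outage_start    # how long recovery took
    data_loss_window = outage_start - last_good_backup  # data written after this is lost
    return {
        "recovery_time": recovery_time,
        "data_loss_window": data_loss_window,
        "rto_met": recovery_time <= rto,
        "rpo_met": data_loss_window <= rpo,
    }


# Hypothetical test results and targets.
result = meets_objectives(
    outage_start=datetime(2024, 1, 10, 2, 0),
    service_restored=datetime(2024, 1, 10, 5, 30),
    last_good_backup=datetime(2024, 1, 10, 1, 15),
    rto=timedelta(hours=4),
    rpo=timedelta(minutes=30),
)
print(result["rto_met"], result["rpo_met"])
```

In this made-up run, recovery took 3.5 hours against a 4-hour RTO, but the last good backup was 45 minutes old against a 30-minute RPO, so the test would flag the backup frequency rather than the failover itself.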
DR testing used to be much more invasive. Now, it can be as simple as a couple of clicks to see if a cloud failover works. It can also be more involved, featuring lots of people. Just make sure you're doing some sort of testing on a frequent basis. Communicate what you'll be doing and when so anyone affected is in the loop. Write up an after-action report to detail how the test went and how the DR plan can be improved.
4. All DR team members should know their roles
Members of the DR team must be familiar with both their primary and backup roles. If a team member whose primary role is applications has a backup role as a telecommunications resource, make sure that person knows what the backup role entails.
Your organization should have a designated DR team leader who is accessible, knows the IT system and the DR plan well, and can communicate effectively. Under the leader, team members should also have knowledge of the DR plan and show the ability to remain calm under pressure. Training is a good way to ensure everyone is on the same page.
5. Have a list of 24-hour resources at the recovery site
This next item may sound odd as a must-have, but it is of utmost importance. You will likely spend many hours at a recovery site and will need to replenish supplies, so knowing the location of the nearest hardware and office supply stores will prevent you from wasting precious minutes when you need these additional resources.
You have choices in setting up a DR site. If your organization has an internal site, you are responsible for setup and maintenance. If the site is external, an outside provider owns and operates it. External site options include a hot site that provides access to a fully functional data center, including customer data; a warm site that is equipped but does not have customer data; and a cold site that has infrastructure but no technology until recovery time.
6. Incorporate an application list in the DR plan
An application list covers every software package or system that will be part of the recovery, and it should always be kept as a master list. Each entry should have the application name as the technical staff identifies it, the name the business side recognizes and any technical details, such as a server name. Along with the technical items, include the application owner, full contact information and backup contacts.
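If the master list is kept in a structured form, a short check can flag entries that are missing required details. The Python sketch below is one possible layout, not a standard schema. The class `AppEntry`, its field names and the sample applications are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AppEntry:
    technical_name: str   # name the technical staff uses
    business_name: str    # name the business side recognizes
    server: str           # technical detail, such as a server name
    owner: str            # application owner
    owner_contact: str    # owner's contact information
    backup_contacts: list[str] = field(default_factory=list)


def incomplete_entries(apps: list[AppEntry]) -> dict[str, list[str]]:
    """Map each application to the fields that are still empty."""
    problems: dict[str, list[str]] = {}
    for app in apps:
        missing = [name for name, value in vars(app).items() if not value]
        if missing:
            problems[app.technical_name or "<unnamed>"] = missing
    return problems


# Hypothetical master list with one incomplete entry.
master_list = [
    AppEntry("erp-prod-01", "Order Management", "srv-erp-01",
             "J. Doe", "555-0130", ["555-0131"]),
    AppEntry("hr-db", "Payroll", "srv-hr-02", "", "", []),
]

gaps = incomplete_entries(master_list)
print(gaps)
```

Running a check like this whenever the list changes helps catch an application with no owner or backup contact before a recovery depends on it.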
7. Include a current diagram of the entire network and recovery site
Each node on the switch and panels should have some means of identification. You do not want to start following cables and wires through switches in the middle of your recovery efforts.
8. Include an easy-to-follow map and directions to the recovery site
Do not assume everyone knows how to get to the recovery site. Secondary directions should be provided in case the main route is congested or impassable. Available parking facilities should also be noted.
9. Provide additional documentation
This additional information includes a list of vendor contacts and insurance documentation, such as policy numbers. These items, as well as a list of all the hardware and software licenses you may have, are helpful to have when implementing your IT disaster recovery plan.
The risk assessment and business impact analysis -- typically performed at the beginning of your disaster recovery and business continuity plan work -- will provide you with much of the documentation and vital records you'll need.
10. Keep it current
The most critical issue regarding an IT disaster recovery plan is that it is current and that a backup copy exists at the recovery site. You should update the plan at least once a year or whenever modifications are made that require a change in the DR plan. These changes can be related to hardware, software, personnel or anything else that would modify the current DR environment.
You don't have to update the plan all at once. Rather, you should constantly review its many elements. You may find that certain sections need updating less frequently than others. A schedule for updating can help.
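A review schedule can be as simple as a table of plan sections, how often each should be reviewed and when it was last checked. The following Python sketch shows one way to list the sections that are due. The section names, intervals and dates are invented for illustration, and `sections_due` is a hypothetical helper.

```python
from datetime import date, timedelta

# Hypothetical review intervals, in days, for a few plan sections.
REVIEW_INTERVAL_DAYS = {
    "contact list": 90,
    "application list": 180,
    "site directions": 365,
}


def sections_due(last_reviewed: dict[str, date],
                 intervals: dict[str, int],
                 today: date) -> list[str]:
    """Return the sections whose review interval has elapsed."""
    return sorted(
        name
        for name, reviewed in last_reviewed.items()
        if today - reviewed >= timedelta(days=intervals[name])
    )


last_reviewed = {
    "contact list": date(2024, 1, 1),
    "application list": date(2024, 3, 1),
    "site directions": date(2023, 9, 1),
}

due = sections_due(last_reviewed, REVIEW_INTERVAL_DAYS, today=date(2024, 6, 1))
print(due)
```

Here, only the contact list is overdue, which matches the idea that fast-changing sections need shorter intervals than stable ones.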
Additional IT disaster recovery plan best practices
Your final DR plan should be concise with easy-to-follow bullet points. It should be written so that someone with a skill level similar to the author's can accurately follow all the steps in the plan. For that reason, omit any shorthand or technical jargon from the instructions. Also keep in mind that DR plans will probably be carried out under extreme conditions. When reviewing and editing the plan, ask yourself if it is simple and concise enough to follow under stress.
You should also review SearchDisasterRecovery's free, downloadable template that will guide you in forming a concise, clear and comprehensive IT disaster recovery plan.