13 reasons your disaster recovery plan failed

Preventing failure is an important goal for DR teams, but disasters do not conform to how plans are designed. Familiarity with the causes of failure can help bolster a DR plan.

By Paul Kirvan

Published: 11 Jul 2024

IT teams that implement a technology disaster recovery plan hope they never have to use it. However, a disaster recovery plan that has never been run through a crisis is an untested strategy, which raises the risk that it will fail when it is needed. No organization wants to face a disruption, but IT teams must not be caught by surprise if one does occur.

The unpredictable nature of threats like ransomware and natural disasters means they can strike at any moment, even if an organization hasn't dealt with them before. If a major disruption to IT infrastructure resources never occurs, the organization might not know for certain that its plan will work.

To adequately understand the importance of a tested disaster recovery plan, IT teams must know the causes of DR plan failure. In addition to guidance on constructing a DR plan, below you'll find 13 common reasons why a DR plan might fail and how to avoid them.

Importance of DR planning

While most IT organizations accept that a DR plan can help in an emergency, they can never be totally certain that it will work as needed, or that the systems and people will perform as intended.

DR planning aims to ensure IT infrastructure elements -- including hardware, software, network services, environmental systems, physical security, cybersecurity, utility services and people -- are safe from a disruptive event. If properly protected by a DR strategy, these critical elements can subsequently return to previous operations.

In a data center, DR typically addresses multiple elements, including the following:

Backup, recovery, replacement and restoration of hardware devices.
Backup, recovery and restoration of network services.
Backup, recovery, retrieval and reinstatement of systems and data.
Recovery and restoration of physical facilities used by the data center.
Recovery and restoration of utility services, such as power and water.
Recovery of IT personnel and their return to their previous roles.

In practice, the above issues might be addressed by a single DR plan. IT teams can also develop individual plans for specific mission-critical resources. The former option describes how to restore IT operations at a high level, while individual plans go into the details of recovering, restarting, testing and validating resources before they return to production status.

In short, the high-level plan describes procedures to recover and restore IT operations, and assumes that the practical details will be addressed by subject matter experts within the IT department. In theory, this approach should work, unless the incident occurs outside the scenarios presented as part of the high-level DR plan.

What if the DR plan fails?

When building DR plans, it is important to take an all-hazards approach while considering potential disruptive events. This increases the likelihood that procedures described in plans will perform as needed -- or at least will help mitigate the severity of the incident.

But what if the above DR planning and recovery initiatives do not work as anticipated?

First, when developing plans, IT teams must consider the issue of DR plan failure. For example, suppose the strategy for protecting servers is to have an inventory of devices ready to replace damaged units. When was the last time the reserve servers were tested? If the backup servers do not work, for whatever reason, then recovery is jeopardized. The same goes for major business systems. If the backup app is not available, or cannot be obtained in a timely fashion, the organization's business -- and reputation -- might be adversely affected.
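
The reserve-server question above can be turned into a routine check. The following is a minimal sketch, assuming a hypothetical inventory in which each spare records the date of its last successful test. The `SpareServer` type, the 90-day threshold and the sample data are illustrative assumptions, not part of any specific tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class SpareServer:
    name: str
    last_tested: Optional[date]  # None means the spare has never been tested


def stale_spares(inventory, today, max_age_days=90):
    """Return names of spares never tested or not tested within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        s.name
        for s in inventory
        if s.last_tested is None or s.last_tested < cutoff
    ]


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    inventory = [
        SpareServer("spare-01", date(2024, 6, 1)),
        SpareServer("spare-02", date(2023, 11, 15)),
        SpareServer("spare-03", None),
    ]
    for name in stale_spares(inventory, today=date(2024, 7, 11)):
        print(f"{name}: test overdue")
```

A report like this does not prove a spare will work, but it flags the units whose readiness is unknown before a real recovery depends on them.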

What can cause a DR plan failure?

Ideally, IT teams identify the risks and threats to important resources, as well as the impact to the business if those resources are disabled, in the plan development phase. Activities in this phase, such as risk assessments and business impact analyses, can provide essential data for plan development and help avoid potential failures. These analyses and assessments can also identify the priorities for resource recovery and restoration, enabling a smooth and orderly recovery.
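
One way to picture how a business impact analysis sets recovery priorities is to sort resources by how long the business can tolerate their loss. The sketch below is illustrative only: the `Resource` fields, the downtime and hourly-impact figures, and the tie-breaking rule are all assumptions, not values from a real assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    max_tolerable_downtime_hours: float  # hypothetical BIA output
    impact_per_hour: float               # hypothetical estimated cost of an outage per hour


def recovery_order(resources):
    """Order resources: shortest tolerable downtime first, then highest hourly impact."""
    return sorted(
        resources,
        key=lambda r: (r.max_tolerable_downtime_hours, -r.impact_per_hour),
    )


if __name__ == "__main__":
    # Hypothetical BIA results, for illustration only.
    bia = [
        Resource("email", 24, 500),
        Resource("order-database", 2, 10_000),
        Resource("payment-gateway", 2, 25_000),
        Resource("intranet-wiki", 72, 50),
    ]
    for rank, r in enumerate(recovery_order(bia), start=1):
        print(rank, r.name)
```

In practice, the ordering would also have to respect dependencies between systems, which this sketch does not model.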

Given the above realities of DR plan development and execution, the following are 13 common reasons that a DR plan might fail. Each relates to an element of the overall DR planning process.

Lack of senior management support and funding. Securing management support and funding is often the most important activity in the process, as a lack of either can limit the development of DR plans. This can result in an organization not implementing a plan at all or having an incomplete plan.
Not involving the right people in the planning process. The DR team typically includes technology staff and should also include employees charged with overall responsibility for the DR process. Third-party experts might also be part of the team.
Tech issues. Technology problems, such as software issues or insufficient backups, are a common reason why a DR plan fails. IT teams must conduct sufficient research and analysis to determine the most cost-effective fixes to technology recovery issues. They should also know when these elements require an update or replacement.
Failure to regularly test plans. Testing is a critical activity because it validates that the procedures defined in the plan will work as intended. It also identifies potential failure points before they can affect a real recovery.
Failure to conduct a post-test review and update the plan based on the test. Once a test is complete, the next step is to review what worked and what did not work. IT teams must update plans to reflect the lessons learned and, if possible, perform follow-up tests to validate the changes.
Not communicating the plan throughout the organization. Employees must be aware that programs exist to ensure the uninterrupted operation of the IT resources they use and know what they should do when an incident occurs.
Insufficient DR team training. Knowledge of how to recover and restore disrupted resources -- whether internally or externally implemented -- must be communicated and regularly reinforced through training to ensure DR teams are prepared to respond in an emergency.
Lack of employee training for a DR event. In addition to making employees aware of DR activities, periodic training is recommended so that employees will know what to do if a technology disruption occurs.
System changes that are not reflected in a revised DR plan. Whenever changes to mission-critical systems and resources occur, they must be reflected in DR plans, especially if procedures for recovery and restoration change.
Lack of regular patching of mission-critical systems. Failure to keep up on patching can result in unintended system disruptions. For example, not installing cybersecurity system patches can result in undetected malware attacks.
Failure to include DR activities in IT staff meetings. If DR is not a regular activity, it can be easily forgotten. A DR agenda item in IT staff meetings is advisable.
Failure to review and assess the plan and its associated activities. In addition to live system testing, it is good practice to periodically review and assess DR plans -- of all types -- to ensure they are up to date and actionable.
Failure to determine what constitutes a "failed" plan. It is important to define in advance what failure means for the DR plan so that the key elements are properly addressed and regular tests have clear criteria to measure against.

Addressing the above issues can help reduce the likelihood that DR plans will fail when an emergency occurs.
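
Deciding what counts as a "failed" plan, the last item in the list above, is easier to act on when failure is expressed as measurable thresholds. The following is a minimal sketch, assuming the organization sets recovery time objective (RTO) and recovery point objective (RPO) targets. These are common DR metrics, but they are not named in this article, and every name and number in the code is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RecoveryTarget:
    rto_minutes: int  # assumed target: maximum acceptable time to restore service
    rpo_minutes: int  # assumed target: maximum acceptable window of lost data


@dataclass
class DrillResult:
    restore_minutes: int    # measured time to restore service during the test
    data_loss_minutes: int  # measured age of the newest recoverable data


def missed_objectives(target, result):
    """Return the objectives a test missed; an empty list means the test passed."""
    misses = []
    if result.restore_minutes > target.rto_minutes:
        misses.append("RTO")
    if result.data_loss_minutes > target.rpo_minutes:
        misses.append("RPO")
    return misses


if __name__ == "__main__":
    # Hypothetical targets and test measurements, for illustration only.
    target = RecoveryTarget(rto_minutes=240, rpo_minutes=60)
    drill = DrillResult(restore_minutes=310, data_loss_minutes=45)
    misses = missed_objectives(target, drill)
    print("FAILED: " + ", ".join(misses) if misses else "PASSED")
```

With criteria like these written down, a post-test review has a concrete pass/fail result to work from rather than a general impression of how the test went.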

Paul Kirvan is an independent consultant, IT auditor, technical writer, editor and educator. He has more than 25 years of experience in business continuity, disaster recovery, security, enterprise risk management, telecom and IT auditing.
