SaaS disaster recovery best practices for users and providers
When it comes to SaaS disaster recovery, there are two main roles: the consumer and the provider. The side of the fence they're on defines their options and responsibilities.
The cloud has changed the game for backup and recovery admins, and software-as-a-service options are continuously on the rise. However, for the critical operation of disaster recovery, there is no room for confusion about who is responsible for what. Organizations must know not only what the SaaS provider can do, but also what they themselves are responsible for in the event of an outage.
The roles of the SaaS consumer and the provider must be clearly outlined by both sides at the start. Along with knowing what the provider is capable of, SaaS consumers must know what type of recovery they want, how much downtime they can tolerate and what they want to do on their end to meet these goals. The SaaS provider must consider its own limitations and plans when something goes awry on the user's end.
SaaS disaster recovery for the consumer
Consumers of SaaS should have a clear idea of what they are looking for in a provider and what is possible through those services. Any decent SaaS provider must provide some level of backup as part of its responsibilities to its clients, but it may not back up as frequently as an end user would find useful. Backup frequency should be spelled out in the SLA documentation, so consumers should read all documentation carefully to understand their responsibilities and those of the provider.
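One way to make the backup-frequency question concrete is to compare the provider's backup interval against the amount of data loss the organization can tolerate. The following is a minimal sketch, not a provider API: the function names and the 24-hour and 4-hour figures are hypothetical, and it assumes the worst case, in which a failure strikes just before the next scheduled backup.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """Worst case: the failure hits just before the next backup runs,
    so everything written since the previous backup is lost."""
    return backup_interval


def meets_tolerance(backup_interval: timedelta, max_tolerable_loss: timedelta) -> bool:
    """True if the backup schedule keeps worst-case loss within tolerance."""
    return worst_case_data_loss(backup_interval) <= max_tolerable_loss


# Hypothetical values: read the real interval from the provider's SLA.
provider_interval = timedelta(hours=24)
tolerable_loss = timedelta(hours=4)

if not meets_tolerance(provider_interval, tolerable_loss):
    print("Provider backups alone are not enough; plan additional in-house backups.")
```

If the check fails, the gap between the two figures is what the consumer's own backups would need to cover.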
Most major SaaS providers encourage consumers to conduct their own backups as well. Backup and retention policies vary among providers, and if a file is accidentally or maliciously deleted, it may be easier and faster to restore from in-house copies than to make a support request. Recovering from the organization's own backups is expedient, and it ensures the restored data is as current as the organization's own backup schedule allows. In the event of a sudden and unexpected outage, backup and disaster recovery admins want to have every option available to them for a swift recovery.
Examples of self-managed backup include products from companies that back up managed email, such as the Microsoft 365 package of applications and data storage options. Most major SaaS providers also offer export options, although each vendor handles export differently.
While vendors may offer export of data, this doesn’t address the SaaS elephant in the room: Without the proprietary SaaS platform, the data alone serves no purpose. The consumers are at the mercy of the SaaS provider and its disaster recovery abilities.
That being said, some vendors, such as Zerto and Veeam, provide the ability to invoke disaster recovery and render a limited read-only version of the services in question for the duration of the outage. This type of backup service could be costly, but if an organization uses it properly, in conjunction with a proper business continuity plan, it can mean the difference between reduced levels of service and none at all.
DR considerations for SaaS providers
For the providers of bespoke SaaS services, the picture is slightly different. It is important not only to plan for loss of service, but also to design for redundancy across multiple physical locations. If an organization does this, it can fail over to another region in an automated fashion without lengthy downtime.
SaaS providers must design the infrastructure to be resilient and not rely on a single zone or region being available for failover purposes. Costs for multiple failover zones may be higher, but the ability to initiate disaster recovery operations and complete a restore quickly can pay for itself after a single outage.
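The idea of not depending on a single failover target can be sketched as a simple selection routine. This is an illustration under stated assumptions rather than any particular platform's failover mechanism: the region names, the `pick_failover_region` function and the `is_healthy` callback are all hypothetical, and a real health check would probe the region instead of reading a dictionary.

```python
from typing import Callable, Optional, Sequence


def pick_failover_region(
    regions: Sequence[str],
    failed_region: str,
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy region other than the failed one.

    Returns None when no alternative is healthy, which is the case
    a design with a single failover region cannot recover from.
    """
    for region in regions:
        if region != failed_region and is_healthy(region):
            return region
    return None


# Hypothetical health data: the primary and the first standby are both down.
health = {"region-a": False, "region-b": False, "region-c": True}
target = pick_failover_region(list(health), "region-a", lambda r: health[r])
print(target)
```

With only `region-a` and `region-b` configured, the same outage would leave no target at all, which is the scenario the extra failover zones are meant to rule out.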
Above and beyond that, the ability to restore data at the granular level is critical. End users lose or somehow mangle their data frequently, so a SaaS provider should have contingencies ready for human error on the consumer side. Providers must make their responsibilities clear and outline those of the consumer from the start, in the SLA. This will avoid confusion or conflict between end user and provider in the event of an outage.
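Granular restore means bringing back only the items a user lost or damaged, without rolling back everything else. The sketch below models a tenant's data as a plain dictionary to show the idea; the `restore_items` function and the file names are hypothetical, and a real provider would do this against its own storage layer.

```python
import copy
from typing import Any, Dict, Iterable


def restore_items(
    live: Dict[str, Any],
    snapshot: Dict[str, Any],
    keys: Iterable[str],
) -> Dict[str, Any]:
    """Return a copy of the live data with only the requested keys
    restored from the snapshot; all other live items are left as they are."""
    restored = copy.deepcopy(live)
    for key in keys:
        if key not in snapshot:
            raise KeyError(f"{key!r} is not present in the snapshot")
        restored[key] = copy.deepcopy(snapshot[key])
    return restored


# Hypothetical scenario: a user deleted report.docx after the snapshot was
# taken, and has since updated budget.xlsx.
snapshot = {"report.docx": "v1", "budget.xlsx": "v3"}
live = {"budget.xlsx": "v4"}

result = restore_items(live, snapshot, ["report.docx"])
print(result)
```

The deleted file comes back from the snapshot while the newer version of the other file is kept, which a full rollback to the snapshot would have overwritten.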