The reality of Docker disaster recovery in practice

A successful Docker DR strategy must account for a range of factors, including the host infrastructure, networks and redundancy in the cloud.

One of the most touted features of Docker is the instant creation and destruction of containers. If one container dies, a new one can immediately replace it.

That easy in and out, however, tends to breed a false sense of security when it comes to the setup and use of a Docker disaster recovery environment.

Docker images may be quick to deploy, but, as with most technologies, the underlying hosts are tightly intertwined with other infrastructure components. Administrators must account for several challenges when they use Docker in a disaster recovery scenario.

Key considerations include:

1. The host infrastructure

Every Docker container needs to run on a host, so developers must ensure they can quickly spin up a replacement host. Those hosts must have specific, standardized builds to ensure consistency. A disaster is the wrong time to try an untested configuration, because the replacement build has to match the original exactly. Reducing downtime is critical, so don't risk discovering mid-recovery that a configuration isn't right.
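One way to make "standardized builds" testable is to compare a replacement host's settings against a known-good baseline before it is needed. The following is a minimal sketch, not a definitive tool: the function name `build_drift` and the manifest keys are hypothetical, and it assumes each host's build can be summarized as a flat set of key-value pairs.

```python
def build_drift(baseline: dict, candidate: dict) -> dict:
    """Return {setting: (expected, actual)} for every setting that differs.

    A setting missing from one side is reported with None on that side.
    """
    drift = {}
    for key in sorted(baseline.keys() | candidate.keys()):
        expected = baseline.get(key)
        actual = candidate.get(key)
        if expected != actual:
            drift[key] = (expected, actual)
    return drift


# Hypothetical manifests; the keys and values are illustrative only.
baseline_host = {"os": "ubuntu-22.04", "docker": "24.0.7", "storage_driver": "overlay2"}
replacement_host = {"os": "ubuntu-22.04", "docker": "24.0.5"}

for setting, (expected, actual) in build_drift(baseline_host, replacement_host).items():
    print(f"{setting}: expected {expected!r}, found {actual!r}")
```

An empty result means the replacement host matches the baseline for every recorded setting. Anything else is a difference to resolve during a test, not during an outage.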

2. The stateful servers

Docker containers are easy to replace, but the data that resides on the non-Docker persistent VM servers will need to be available in disaster recovery. The database servers must be readily available, along with other items such as load balancers, middleware systems and authentication servers. Administrators must include and account for these items in a disaster recovery plan.

3. Networking

Networking is critical and can be a nightmare to fix in a disaster. Enterprise network diagrams should show the interdependencies of organizational activities, but recreating those interdependencies from a diagram during a disaster, when time is money, is a poor approach. Complete all Docker disaster recovery configuration and testing upfront.

4. Correctly configured VPN access

Even if the replacement infrastructure could be spun up promptly, accessing that new infrastructure could be problematic. For example, if a company accesses its application via a site-to-site VPN, it would have to reconfigure the VPN and firewall rules to allow access to the new location without locking out legitimate users.

How to avoid Docker DR pitfalls

Docker disaster recovery issues come in several forms. One method of mitigation is to build geo-redundancy into the overall cloud design.

If an organization is at the start of its Docker and cloud design journey, it should ensure the application is spread across multiple public cloud regions, irrespective of provider.

That way, should one location become inaccessible, the resources are still available. While a geo-replication strategy may be more costly, this best practice reduces downtime. A DR plan must note how much downtime the organization can handle and any clients with hefty downtime penalties in their service-level agreements.
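The downtime an organization can handle is easier to plan around as a number. The sketch below is an illustration under stated assumptions, not a prescribed method: it assumes each client SLA is expressed as an availability percentage over a fixed period, and the function names, client names and percentages are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(availability_pct: float, period_days: int = 30) -> float:
    """Minutes of downtime permitted by an availability target over a period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return (1 - availability_pct / 100) * period_days * MINUTES_PER_DAY


def tightest_budget(client_slas: dict, period_days: int = 30) -> tuple:
    """Return (client, minutes) for the client whose SLA allows the least downtime."""
    client = max(client_slas, key=client_slas.get)
    return client, allowed_downtime_minutes(client_slas[client], period_days)


# Hypothetical clients and availability targets.
slas = {"client-a": 99.5, "client-b": 99.95}
client, minutes = tightest_budget(slas)
print(f"{client} sets the budget: about {minutes:.1f} minutes per 30 days")
```

Over a 30-day period, a 99.9% target leaves roughly 43 minutes of downtime. If a tested failover takes longer than the tightest client's budget, that gap is a strong argument for the extra cost of geo-redundancy.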

For organizations without a geo-redundant cloud design, most providers offer in-cloud DR failover.

For companies and developers using Docker in a hybrid cloud or private cloud, the issue can be more complex, but not insurmountable. A well-documented network diagram and DR plan will help.

If the environment is virtual, administrators can fail over the application, Docker hosts, databases, auditing servers and authentication control as one consistent group. Private and hybrid cloud environments require that the organization set up all failover details ahead of time. That upfront work may sound burdensome, but when the administrator needs to fail over the application, critical data such as the IP, virtual LAN and routing information is already in place and ready to go.
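Failover details recorded ahead of time are only useful if they are valid. As a rough sketch, the check below validates one record per server using Python's standard `ipaddress` module. The `FailoverRecord` fields and the example values are assumptions made for illustration; a real environment would pull them from its own DR tooling.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class FailoverRecord:
    """Hypothetical pre-staged network details for one server at the DR site."""
    name: str
    dr_ip: str
    dr_subnet: str
    vlan_id: int
    gateway: str


def record_problems(rec: FailoverRecord) -> list:
    """Return a list of human-readable problems; an empty list means the record looks sane."""
    try:
        subnet = ipaddress.ip_network(rec.dr_subnet)
    except ValueError:
        return [f"{rec.name}: invalid subnet {rec.dr_subnet!r}"]

    problems = []
    for label, value in (("IP", rec.dr_ip), ("gateway", rec.gateway)):
        try:
            addr = ipaddress.ip_address(value)
        except ValueError:
            problems.append(f"{rec.name}: invalid {label} {value!r}")
            continue
        if addr not in subnet:
            problems.append(f"{rec.name}: {label} {value} is outside {rec.dr_subnet}")

    # Usable 802.1Q VLAN IDs run from 1 to 4094.
    if not 1 <= rec.vlan_id <= 4094:
        problems.append(f"{rec.name}: VLAN {rec.vlan_id} is out of range")
    return problems


docker_host = FailoverRecord("docker-host-1", "10.20.0.15", "10.20.0.0/24", 120, "10.20.0.1")
print(record_problems(docker_host) or "record looks consistent")
```

A check like this catches typos and mismatched subnets, but it does not prove the routing works. That still requires a failover test.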

DR administrators must test these setups in a consistent and timely manner to ensure the following:

  • all required resources are included in the protection group;
  • the application works as expected with no incorrect setup in the failover configuration; and
  • access from the user base to the application, and from the application to third-party providers, works as expected.
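Parts of this checklist can be scripted so that each DR test runs the same way. The sketch below covers the first and third items under simple assumptions: the protection group is a set of server names, and "accessible" means a TCP connection succeeds within a timeout. The function names, server names and endpoints are hypothetical, and checking that the application behaves correctly still needs its own tests.

```python
import socket


def missing_from_group(required: set, protected: set) -> set:
    """Return required resources that are not in the protection group."""
    return required - protected


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical inventory for illustration.
required = {"app", "docker-host-1", "db-primary", "auth", "audit"}
protected = {"app", "docker-host-1", "db-primary", "auth"}

gaps = missing_from_group(required, protected)
if gaps:
    print("Not in protection group:", ", ".join(sorted(gaps)))

# Endpoints to probe after a test failover; replace with real hosts and ports.
endpoints = []  # e.g. [("app.dr.example.internal", 443)]
for host, port in endpoints:
    status = "reachable" if can_connect(host, port) else "UNREACHABLE"
    print(f"{host}:{port} {status}")
```

Run the probes from where users actually sit, such as behind the site-to-site VPN, and again from the application hosts toward third-party providers. Both directions in the checklist then get covered.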

Keep developers in the DR loop

Many developers don't get involved in infrastructure. They are paid to code. However, that way of thinking can backfire. This is why developer education is critical.

Help developers understand the back-end infrastructure so they know how and where critical issues will occur. Demonstrate how they can work with the administrators to minimize issues they may face in a disaster. Also, enable them to write code that simplifies any future disaster recovery.
