Back up Kubernetes persistent volumes with snapshots

A snapshot is a useful tool to back up Kubernetes persistent volumes. Follow these steps to create a snapshot on GKE using YAML files and kubectl commands.

IT teams often use Kubernetes persistent volumes as a repository to store application data. As with any other application data, admins must regularly back up the data stored in Kubernetes persistent volumes.

There are two main options for doing this. The first is to use a third-party backup application that supports Kubernetes. The second is to use native Kubernetes tools to create snapshots of Kubernetes persistent volumes.

Below, we cover how to use snapshots to back up Kubernetes persistent volumes on Google Kubernetes Engine (GKE).

Kubernetes snapshots for backups

Google outlines a few basic requirements to take a snapshot on GKE. First, ensure the Container Storage Interface (CSI) driver supports snapshots. Second, use GKE version 1.17 or higher. Third, have a persistent volume claim, which is a user's request for storage, and make sure the persistent volume is managed by the CSI driver. The CSI driver makes the storage accessible through API calls, which enables the backup to interact with the storage.
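These requirements can be checked from the command line before going any further. The following is a minimal sketch rather than an official procedure: the function name check_snapshot_prereqs and the claim name passed to it are hypothetical, and it assumes kubectl is already configured to talk to the GKE cluster. Newer clusters may serve snapshot.storage.k8s.io/v1 rather than the v1beta1 version used in this article's examples, so the second command is worth running first.

```shell
# Hypothetical helper: prints the details needed to confirm the snapshot prerequisites.
# Usage: check_snapshot_prereqs <pvc-name>
check_snapshot_prereqs() {
  pvc_name="$1"

  # Client and server versions; the article calls for GKE 1.17 or higher.
  kubectl version

  # Snapshot API versions served by the cluster (for example v1beta1 or v1).
  kubectl api-versions | grep snapshot.storage.k8s.io

  # Storage class behind the claim, then the provisioner (CSI driver) behind that class.
  storage_class=$(kubectl get pvc "$pvc_name" -o jsonpath='{.spec.storageClassName}')
  kubectl get storageclass "$storage_class" -o jsonpath='{.provisioner}'
}
```

Calling check_snapshot_prereqs with the name of an existing claim should end by printing the provisioner name, which is the value to compare against the CSI driver you plan to use.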

The next step is to create a volume snapshot class. A volume snapshot class is defined in a YAML file that tells Kubernetes which CSI driver to use when creating snapshots. Here is an example YAML file from Google's documentation:

# snapshot-class-example.yaml

apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: snapshot-class
driver: csi-driver
deletionPolicy: Delete

You'll need to make two changes to this file before you use it. First, in the name: field, replace snapshot-class with the name of the snapshot class you want to create. Then, in the driver: field, replace csi-driver with the name of the CSI driver that you want to use. For example, if you are using the Compute Engine Persistent Disk CSI Driver, then the driver name would be pd.csi.storage.gke.io.

The Compute Engine Persistent Disk CSI Driver is often used, but it isn't the only CSI driver. Many storage vendors provide their own CSI drivers. If you use one of these drivers, then you will need to specify the name of that driver rather than using the pd.csi.storage.gke.io driver.
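To illustrate the two substitutions, the sketch below writes a filled-in copy of the manifest with a shell here-document. The class name pd-snapshot-class is a hypothetical placeholder, and the driver shown is the Compute Engine Persistent Disk CSI Driver; swap in your vendor's driver name if you use a different one. The API version is kept as it appears in the example above and may need to change on newer clusters.

```shell
# Hypothetical values; change these to suit your cluster.
SNAPSHOT_CLASS_NAME="pd-snapshot-class"
CSI_DRIVER_NAME="pd.csi.storage.gke.io"   # Compute Engine Persistent Disk CSI Driver

# Write the manifest with both placeholders filled in.
cat > snapshot-class-example.yaml <<EOF
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: ${SNAPSHOT_CLASS_NAME}
driver: ${CSI_DRIVER_NAME}
deletionPolicy: Delete
EOF
```

Keeping the two values in variables makes it easier to reuse the same snippet for a different driver or class name.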

Once you have prepared the YAML file, use the kubectl command to create the volume snapshot class. Replace snapshot-class-example.yaml with the name of the YAML file you created:

kubectl apply -f snapshot-class-example.yaml
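After applying the file, it can be useful to confirm that the class exists and points at the intended driver. The helper below is a small sketch under the same assumptions as before (kubectl configured for the cluster); the function name snapshot_class_driver is hypothetical.

```shell
# Hypothetical helper: prints the CSI driver registered for a snapshot class.
# Usage: snapshot_class_driver <class-name>
snapshot_class_driver() {
  kubectl get volumesnapshotclass "$1" -o jsonpath='{.driver}'
}
```

For example, snapshot_class_driver snapshot-class should print the driver name you entered in the driver: field.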

Next, create a second YAML file. Unlike the first YAML file, which defines the snapshot class, this one defines the snapshot itself. This YAML file lists the API version, the resource kind (VolumeSnapshot), the snapshot name, the volume snapshot class name (as defined by the first YAML file) and the name of the persistent volume claim.

Here is an example of a snapshot YAML file from Google's documentation:  

# snapshot-example.yaml

apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: snapshot
spec:
  volumeSnapshotClassName: snapshot-class
  source:
    persistentVolumeClaimName: pvc

Once again, there are a few things to change in the YAML file above before you can use it to create a snapshot. First, change the name of the snapshot to something more descriptive. For example, the new name might reference the application rather than the volume name, since that conveys the purpose of the data within the volume.

Next, replace snapshot-class with the name of the snapshot class you created earlier. Finally, replace pvc with the name of the persistent volume claim on which you will base the snapshot. To create snapshots of multiple persistent volume claims, create a separate YAML file for each. This is why it's important to choose a snapshot name that reflects the persistent volume being protected.
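When several claims need protecting, the per-claim files can be generated rather than written by hand. The loop below is a sketch only: the claim names orders-db-pvc and web-uploads-pvc and the class name pd-snapshot-class are hypothetical, and the API version is kept as it appears in the example above.

```shell
# Hypothetical PVC names and snapshot class; replace with your own.
SNAPSHOT_CLASS_NAME="pd-snapshot-class"
PVC_NAMES="orders-db-pvc web-uploads-pvc"

for pvc in $PVC_NAMES; do
  # One manifest per claim, with file and snapshot names derived from the claim.
  cat > "snapshot-${pvc}.yaml" <<EOF
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: ${pvc}-snapshot
spec:
  volumeSnapshotClassName: ${SNAPSHOT_CLASS_NAME}
  source:
    persistentVolumeClaimName: ${pvc}
EOF
done
```

Deriving both the file name and the snapshot name from the claim keeps it obvious which volume each snapshot protects.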

At this point, you are ready to create a snapshot. Issue the following command -- but, again, replace snapshot-example with the name of the YAML file you just created:

kubectl apply -f snapshot-example.yaml
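Creating the VolumeSnapshot object does not mean the snapshot is immediately usable, so it is worth checking its status before relying on it as a backup. The helper below is a minimal sketch; the function name snapshot_is_ready is hypothetical, and it assumes the CSI driver reports the standard readyToUse status field.

```shell
# Hypothetical helper: succeeds only when the snapshot reports readyToUse=true.
# Usage: snapshot_is_ready <snapshot-name>
snapshot_is_ready() {
  ready=$(kubectl get volumesnapshot "$1" -o jsonpath='{.status.readyToUse}')
  [ "$ready" = "true" ]
}
```

A typical use would be snapshot_is_ready snapshot && echo "snapshot is ready", with snapshot replaced by the name you set in the metadata section.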
