
Configure Kubernetes garbage collection to get rid of waste

Kubernetes garbage collection is an important task for cluster health. Learn how to configure garbage collection to suit your needs and manage deployments effectively.

Garbage collection maintains the health of Kubernetes clusters. Admins should follow garbage collection configuration best practices and do enough research to decide whether to rely on the default, automatic process to clean up workloads or to configure garbage collection settings themselves.

The highest-level component of Kubernetes is a cluster where nodes are grouped together to run pods. The underlying container images change as new application versions are deployed. Kubernetes terminates older pods as the pods that contain new application versions are deployed.

Garbage collection refers to mechanisms that clean up cluster resources for Kubernetes, which is important for the health of a cluster. Garbage collection can clean up resources such as terminated pods, completed Jobs, unused containers and more.

Garbage collection configurations

There are several configurations that can help organizations gain control over garbage collection.

Metadata fields

Kubernetes uses a metadata field called ownerReferences to keep track of which resources belong to other, higher-level resources. With that information, Kubernetes can clean up all owned resources after a parent resource is deleted.

For example, a Deployment owns a ReplicaSet, which in turn owns the pods in that ReplicaSet. Therefore, when a Deployment is deleted by an admin or by a tool outside of Kubernetes, the ReplicaSet and its pods are also deleted through their owner references.

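It is possible to change this behavior by deleting the parent with the Orphan propagation policy, for example with kubectl delete --cascade=orphan. The dependents are then left untouched after the parent resource is removed. The blockOwnerDeletion field, which is set per entry in metadata.ownerReferences, serves a different purpose: when it is true, a foreground deletion of the owner cannot complete until that dependent has been removed.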
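The ownership chain described above can be sketched with a small in-memory model. This is an illustration only, not how the Kubernetes garbage collector is implemented: the object names are hypothetical, owners are referenced by name rather than by UID, and the orphan argument stands in for deleting a parent with the Orphan propagation policy.

```python
import copy

# Hypothetical objects; each lists the names of its owners.
objects = {
    "deployment/web": {"ownerReferences": []},
    "replicaset/web-abc": {"ownerReferences": ["deployment/web"]},
    "pod/web-abc-1": {"ownerReferences": ["replicaset/web-abc"]},
    "pod/web-abc-2": {"ownerReferences": ["replicaset/web-abc"]},
    "pod/standalone": {"ownerReferences": []},
}


def delete(store, name, orphan=False):
    """Delete `name` from `store` and return the store.

    Dependents lose their reference to the deleted owner. Unless
    orphan=True, a dependent with no owners left is deleted as well.
    """
    dependents = [n for n, o in store.items() if name in o["ownerReferences"]]
    del store[name]
    for dep in dependents:
        if dep not in store:
            continue
        store[dep]["ownerReferences"].remove(name)
        if not orphan and not store[dep]["ownerReferences"]:
            delete(store, dep)
    return store


# Cascading delete removes the ReplicaSet and its pods.
cascaded = delete(copy.deepcopy(objects), "deployment/web")
# Orphaning keeps the ReplicaSet and pods, minus the owner reference.
orphaned = delete(copy.deepcopy(objects), "deployment/web", orphan=True)
```

In the cascading case only the pod without an owner survives, while in the orphaning case everything except the Deployment remains in the store.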

Images

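By default, the kubelet running on every node checks for unused images every five minutes. An unused image becomes eligible for removal only once it is older than a minimum age, which defaults to two minutes. To change that minimum age, use a kubelet config file and provide a duration value to the imageMinimumGCAge field.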

To decide when to trigger image garbage collection, the kubelet takes disk usage into consideration. Two configurable fields, imageGCHighThresholdPercent and imageGCLowThresholdPercent, control how much it removes. When disk usage reaches the value set in imageGCHighThresholdPercent, the kubelet starts deleting unused images, beginning with the ones that were least recently used. It continues until disk usage falls to the value set in imageGCLowThresholdPercent.
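As a rough illustration of the threshold logic, the sketch below builds a kubelet configuration fragment and simulates which images would be removed. It assumes the full field names from the KubeletConfiguration API (imageGCHighThresholdPercent and imageGCLowThresholdPercent) and uses 85 and 80 as example values. The images_to_delete function is a hypothetical simplification that ignores imageMinimumGCAge; it is not the kubelet's actual code.

```python
import json

# Kubelet config fragment; the threshold values are examples.
kubelet_config = {
    "apiVersion": "kubelet.config.k8s.io/v1beta1",
    "kind": "KubeletConfiguration",
    "imageMinimumGCAge": "2m",
    "imageGCHighThresholdPercent": 85,
    "imageGCLowThresholdPercent": 80,
}
config_json = json.dumps(kubelet_config, indent=2)


def images_to_delete(images, disk_used, disk_capacity, high=85, low=80):
    """Return the names of unused images to delete, least recently used first.

    images: list of (name, size, last_used) tuples for unused images.
    Nothing is deleted until usage reaches the high threshold; deletion
    then continues until usage is at or below the low threshold.
    """
    if disk_used * 100 < high * disk_capacity:
        return []
    target = low * disk_capacity / 100
    deleted = []
    for name, size, _last_used in sorted(images, key=lambda i: i[2]):
        if disk_used <= target:
            break
        disk_used -= size
        deleted.append(name)
    return deleted


# Hypothetical node: 100 units of disk, three unused images.
unused = [("app:v3", 20, 3), ("app:v1", 5, 1), ("app:v2", 10, 2)]
at_90_percent = images_to_delete(unused, disk_used=90, disk_capacity=100)
at_84_percent = images_to_delete(unused, disk_used=84, disk_capacity=100)
```

With usage at 90 units, the two least recently used images are removed to bring usage under the low threshold. At 84 units the high threshold has not been reached, so nothing is removed.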

Containers

Unused containers are cleaned up every minute. Control the specific behavior of a cleanup with the kubelet flags --maximum-dead-containers, --maximum-dead-containers-per-container and --minimum-container-ttl-duration.

--maximum-dead-containers sets the maximum number of stopped containers to keep on a node before garbage collection removes the oldest ones. The default is -1, which means there is no node-wide limit on the number of stopped containers; admins can set a different value when starting the kubelet.

--maximum-dead-containers-per-container sets the number of old container instances to keep per container. The default value is 1. --minimum-container-ttl-duration controls how long a container must have been finished before it can be garbage collected. The default is 0, which means the minimum age is disabled and a finished container is eligible for collection right away.
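How the three settings interact can be sketched with a simplified model. The function below is hypothetical and does not reproduce the kubelet's exact algorithm: it filters out containers that are too young, keeps the newest instances per container, and then trims the oldest survivors if a node-wide limit is set. The container keys, instance IDs and timestamps are made up for the example.

```python
from collections import defaultdict


def dead_containers_to_remove(dead, now, min_age=0, max_per_container=1, max_total=-1):
    """Return the sorted instance IDs of dead containers to remove.

    dead: list of (container_key, instance_id, finished_at) tuples.
    min_age, max_per_container and max_total mirror, in simplified form,
    --minimum-container-ttl-duration, --maximum-dead-containers-per-container
    and --maximum-dead-containers. A negative limit means "no limit".
    """
    # Only containers that finished at least min_age ago are eligible.
    eligible = [d for d in dead if now - d[2] >= min_age]
    groups = defaultdict(list)
    for d in eligible:
        groups[d[0]].append(d)

    remove, keep = [], []
    for instances in groups.values():
        instances.sort(key=lambda d: d[2], reverse=True)  # newest first
        limit = len(instances) if max_per_container < 0 else max_per_container
        keep.extend(instances[:limit])
        remove.extend(instances[limit:])

    # Enforce the node-wide limit by dropping the oldest kept instances.
    if max_total >= 0 and len(keep) > max_total:
        keep.sort(key=lambda d: d[2])  # oldest first
        remove.extend(keep[: len(keep) - max_total])
    return sorted(d[1] for d in remove)


dead = [
    ("pod-a/app", "c1", 10),
    ("pod-a/app", "c2", 20),
    ("pod-a/app", "c3", 30),
    ("pod-b/db", "d1", 15),
]
with_defaults = dead_containers_to_remove(dead, now=100)
with_node_limit = dead_containers_to_remove(dead, now=100, max_total=1)
with_min_age = dead_containers_to_remove(dead, now=100, min_age=85)
```

With the defaults, only the newest dead instance of each container is kept. Adding a node-wide limit of one removes the older of the two survivors as well, and a large minimum age shields recently finished containers from collection.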

Kubernetes Jobs

After a Kubernetes Job finishes, the completed Job and its pods stick around by default unless another garbage collection condition is triggered. For example, the kube-controller-manager's terminated-pod-gc-threshold setting caps the number of terminated pods that can exist before garbage collection starts to delete them. In most cases, terminated pods remain for a while because the default threshold is 12,500 pods.

Set the .spec.ttlSecondsAfterFinished field of the Job to control this behavior. This field determines how many seconds pass after the Job finishes before the TTL controller deletes the Job along with its pods. It is recommended to use this field because Jobs created directly, rather than through a higher-level resource such as a CronJob, otherwise have a default deletion policy of orphanDependents. With orphanDependents, pods started by a Job are orphaned when the Job is deleted. This could lead to performance degradation if many orphaned pods build up. Set a ttlSecondsAfterFinished value to ensure the pods are deleted after a Job finishes.
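As a minimal sketch, the snippet below builds a Job manifest that sets ttlSecondsAfterFinished and estimates how long a finished Job has left before it becomes eligible for cleanup. The Job name, container image, command and the 100-second TTL are assumptions chosen for illustration, and seconds_until_cleanup is a hypothetical helper rather than part of any Kubernetes client.

```python
import json

# Example Job manifest; name, image and TTL are illustrative values.
job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "example-cleanup-job"},
    "spec": {
        "ttlSecondsAfterFinished": 100,
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "main",
                        "image": "busybox",
                        "command": ["sh", "-c", "echo done"],
                    }
                ],
                "restartPolicy": "Never",
            }
        },
    },
}
manifest = json.dumps(job, indent=2)


def seconds_until_cleanup(job_manifest, finished_at, now):
    """Return seconds left before the Job may be deleted, or None if no TTL is set.

    finished_at and now are timestamps in seconds.
    """
    ttl = job_manifest["spec"].get("ttlSecondsAfterFinished")
    if ttl is None:
        return None
    return max(0, finished_at + ttl - now)


remaining = seconds_until_cleanup(job, finished_at=1000, now=1040)
```

Because Kubernetes accepts manifests in JSON as well as YAML, the generated manifest string can be saved to a file and applied like any other manifest.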

Finalizers

To tell Kubernetes that specific actions must be taken before a resource is deleted, create the resource with a manifest file and set the metadata.finalizers field.

Finalizers look similar to annotations: they are plain strings stored in the object's metadata. The real work happens in the controller that manages the finalizer. For example, a PersistentVolume often carries the kubernetes.io/pv-protection finalizer. This prevents the PersistentVolume from being removed, whether by an admin or by an automated process, while it is still in use. In a scenario where a PersistentVolume that a pod is using gets deleted, the resource is marked as Terminating but can't be removed until the finalizer key kubernetes.io/pv-protection is cleared. The persistent volume controller clears the finalizer only once the pod stops using the PersistentVolume, at which point the PersistentVolume can be deleted.
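The deletion flow described above can be modeled in a few lines. The sketch below is a toy in-memory simulation, not the Kubernetes API: request_delete and remove_finalizer are hypothetical helpers, and the object name pv/data is made up. It shows that a delete request only marks the object as Terminating while finalizers remain, and that the object disappears once the last finalizer is cleared.

```python
def request_delete(store, name, now):
    """Handle a delete request and return the resulting state."""
    meta = store[name]["metadata"]
    if meta["finalizers"]:
        # Finalizers present: record the deletion time and wait.
        meta["deletionTimestamp"] = now
        return "Terminating"
    del store[name]
    return "Deleted"


def remove_finalizer(store, name, finalizer):
    """Clear one finalizer, as a controller would, and return the state."""
    meta = store[name]["metadata"]
    meta["finalizers"].remove(finalizer)
    pending_delete = meta.get("deletionTimestamp") is not None
    if pending_delete and not meta["finalizers"]:
        del store[name]
        return "Deleted"
    return "Terminating" if pending_delete else "Active"


# Hypothetical PersistentVolume that is still in use by a pod.
store = {
    "pv/data": {
        "metadata": {
            "finalizers": ["kubernetes.io/pv-protection"],
            "deletionTimestamp": None,
        }
    }
}

state_after_delete = request_delete(store, "pv/data", now=1000)
still_present = "pv/data" in store
state_after_release = remove_finalizer(store, "pv/data", "kubernetes.io/pv-protection")
```

The first call leaves the volume in place in the Terminating state. Only after the finalizer is removed, which stands in for the pod releasing the volume, does the object leave the store.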

Matthew Grasberger is a DevOps engineer at Imperfect Foods. He has experience in test automation, DevOps engineering, security automation and open source mobile testing frameworks.
