Manage cluster resources with Kubernetes requests and limits

Proper Kubernetes cluster management starts at the pod level. Learn about Kubernetes limits and requests, and how Resource Quotas and Limit Ranges work to control resource consumption.

By default, a pod in Kubernetes can consume as much CPU and memory as the underlying node makes available. But that doesn't mean it should.

IT teams set limits to keep any one pod from using all of a node's resources. That way, no application can starve the others that share the node.

To understand these configurable constraints, first let's examine requests and limits in Kubernetes. Both are set per container in the pod specification. The Kubernetes scheduler uses the sum of a pod's container requests to determine where to place the pod.

  • Requests. A request is the minimum amount of a resource that a container needs on a node to function properly. If no node has enough unreserved resources to meet a pod's requests, the pod isn't scheduled and remains in the Pending state.
  • Limits. A limit sets the maximum amount of a resource that a container can use. If a container tries to exceed its limit, it's either throttled (in the case of CPU) or terminated (in the case of memory). Limits ensure containers don't consume more resources than they're assigned, which would otherwise lead to resource exhaustion on the node.

Choosing practical requests and limits

Setting requests and limits correctly is critical to the health of a Kubernetes cluster. But choosing pragmatic values is not straightforward, and application behavior can be unpredictable.

Use trial and error to determine appropriate values. There is no one-size-fits-all or optimal value for all containers. Resource consumption depends on the application and varies on a case-by-case basis.

Ideally, maintain a 20% to 30% margin above expected usage when setting requests so that, even if a container needs slightly more than anticipated, the node can absorb it. But don't overcommit resources, as that can result in performance bottlenecks on the node.
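
As a rough illustration of that margin, the fragment below sizes requests about 25% above a measured baseline. The baseline figures (500m CPU, 400Mi memory) are hypothetical, as is the choice to set limits somewhat above the requests; substitute numbers observed from your own workload.

```yaml
# Hypothetical container resources block; baseline usage is assumed, not measured.
resources:
  requests:
    cpu: "625m"      # 500m observed baseline x 1.25
    memory: "500Mi"  # 400Mi observed baseline x 1.25
  limits:
    cpu: "800m"      # assumed ceiling above the request
    memory: "640Mi"  # assumed ceiling above the request
```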

Setting requests and limits

Setting Kubernetes requests and limits is as simple as applying the pod configuration file below. CPU is expressed in millicores ("m," where 1000m equals one CPU) and memory in mebibytes ("Mi").

apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  containers:
  - name: demo-container-1
    image: nginx
    resources:
      limits:
        memory: "900Mi"
        cpu: "900m"
      requests:
        memory: "700Mi"
        cpu: "600m"

Save the above YAML to a file named config.yaml and run the following command to create this pod:

kubectl apply -f config.yaml --namespace=demo-namespace
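
The command above assumes that demo-namespace already exists; kubectl returns a "not found" error otherwise. A minimal sketch of a namespace manifest, with the file name namespace.yaml chosen here only for illustration:

```yaml
# Creates the namespace that the pod is deployed into.
apiVersion: v1
kind: Namespace
metadata:
  name: demo-namespace
```

Apply it with kubectl apply -f namespace.yaml before creating the pod. Afterward, kubectl describe pod demo-pod --namespace=demo-namespace should list the requests and limits under the container's details.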

Default requests and limits

Developers should set their own resource requests and limits. But sometimes they either over-provision resources for containers out of an abundance of caution or forget to set requests and limits altogether. Therefore, Kubernetes cluster administrators should enforce requests and limits at the namespace level. This ensures that, as soon as a container is created in the namespace, resource bounds are applied to it automatically.

There are two ways to control resource consumption on a Kubernetes namespace: Resource Quotas and Limit Ranges.

Resource Quotas

A ResourceQuota limits the total resource consumption of a namespace. For example, the following YAML configuration caps the namespace at 10 CPUs and 20 GiB of memory in total requests, and at 10 pods.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: small-pods-quota
spec:
  hard:
    cpu: "10"
    memory: 20Gi
    pods: "10"
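
The cpu and memory keys count requests; they are shorthand for requests.cpu and requests.memory. To cap the sum of limits as well, a variant along the following lines could be used. This is a sketch: the object name and the limits.* values are hypothetical.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: small-pods-quota-explicit  # hypothetical name
spec:
  hard:
    requests.cpu: "10"       # sum of CPU requests across the namespace
    requests.memory: 20Gi    # sum of memory requests across the namespace
    limits.cpu: "20"         # assumed cap on the sum of CPU limits
    limits.memory: 40Gi      # assumed cap on the sum of memory limits
    pods: "10"               # maximum number of pods
```

Once a quota covers CPU or memory, pods that omit the corresponding requests or limits are generally rejected at creation unless a LimitRange supplies defaults.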

Limit Ranges

LimitRange objects manage constraints at the pod or container level. The configuration applies to each individual container, rather than to the namespace's total consumption. If a container doesn't specify its own requests or limits, the LimitRange applies default values, and the resulting totals still count against the ResourceQuota at the namespace level.

LimitRange objects are defined as shown in the following example YAML configuration, where the "max" and "min" sections set the highest and lowest values a container may specify. The "default" section assigns default limits to containers that don't specify their own. Finally, the "defaultRequest" section assigns default request values to containers that don't define them.

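apiVersion: v1
kind: LimitRange
metadata:
  name: defaultlimits
spec:
  limits:
  - default:            # limits applied when a container sets none
      memory: 100Mi
      cpu: 100m
    defaultRequest:     # requests applied when a container sets none
      memory: 70Mi
      cpu: 70m
    max:                # highest limit a container may specify
      memory: 300Mi
      cpu: 400m
    min:                # lowest request a container may specify
      memory: 50Mi
      cpu: 50m
    type: Container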
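
To see the defaults take effect, consider a pod that declares no resources at all. With the LimitRange above in place in the same namespace, the values noted in the comments should be filled in when the pod is admitted. The pod and container names are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: no-resources-pod  # hypothetical name
spec:
  containers:
  - name: demo-container-2  # hypothetical name
    image: nginx
    # No resources block. The LimitRange is expected to inject:
    #   requests: cpu 70m,  memory 70Mi   (from defaultRequest)
    #   limits:   cpu 100m, memory 100Mi  (from default)
```

Running kubectl get pod no-resources-pod --output=yaml --namespace=demo-namespace after creation shows the injected values. A container whose limits exceed the max (400m CPU or 300Mi memory), or whose requests fall below the min, is rejected.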
