How to auto scale on Kubernetes (GKE) with a pod that runs one per node and uses all available resources?

jacob

I think I have a pretty simple scenario: I need to auto-scale on Google Kubernetes Engine with a pod that runs one per node and uses all available remaining resources on the node.

"Remaining" resources means that there are certain basic pod services running on each node such logging and metrics, which need their requested resources. But everything left should go to this particular pod, which is in fact the main web service for my cluster.

Also, these remaining resources should be available when the pod's container starts up, rather than being granted later via vertical autoscaling with pod restarts. The reason is that the container has certain constraints that make restarts expensive: heavy disk caching, and licensing issues with some third-party software I use. So although the container/pod is certainly restartable, I'd like to avoid restarts except for rolling updates.

The cluster should scale nodes when CPU utilization gets too high (say, 70%). And I don't mean requested CPU utilization of a node's pods, but rather the actual utilization, which is mainly determined by the web service's load.

How should I configure the cluster for this scenario? I've seen there's cluster autoscaling, vertical pod autoscaling, and horizontal pod autoscaling. There's also Deployment vs. DaemonSet, although DaemonSets don't seem to be designed for pods that need to scale. So I think a Deployment may be necessary, but in a way that limits it to one web service pod per node (pod anti-affinity??).

How do I put all this together?

Aleksi

You could set up a Deployment with a resource request that equals a single node's allocatable resources (i.e., total resources minus what the auxiliary services request, as you mentioned). Then configure a Horizontal Pod Autoscaler to scale the Deployment up when CPU utilization goes above 70%. Since the request covers essentially the whole node, utilization relative to the request is effectively the same as total node utilization, so this should do the trick. If you do want to base scaling on actual node CPU utilization instead, you can always scale on external metrics.
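As a rough sketch, the HPA part might look like this (the names `web-service-hpa` and `web-service`, and the replica bounds, are placeholders to adapt to your setup):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-service-hpa        # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-service          # hypothetical Deployment name
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale up above 70% of the CPU *request*
```

Note that `Utilization` here is measured against the pod's CPU request, which is why sizing the request to the node matters.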

Technically the Deployment's resource request doesn't have to exactly equal the remaining resources; it's enough for the request to be large enough that two of these pods can't be scheduled on the same node (i.e., more than half the node's allocatable resources). As long as that's the case and no resource limits are set, the pod ends up free to consume all the available node resources.
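A minimal sketch of such a Deployment, assuming (hypothetically) 4-CPU / 15 GiB nodes where the auxiliary pods request a modest slice; the numbers and names are illustrative, so adjust them to your node shape:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-service            # hypothetical name
spec:
  replicas: 1                  # the HPA will adjust this
  selector:
    matchLabels:
      app: web-service
  template:
    metadata:
      labels:
        app: web-service
    spec:
      containers:
      - name: web
        image: example.com/web-service:1.0   # placeholder image
        resources:
          requests:
            cpu: "2500m"       # > half of a 4-CPU node, so two pods never fit on one node
            memory: "10Gi"     # likewise sized against the node's allocatable memory
          # no limits: the container may burst into everything left on the node
```

The key design choice is requests-without-limits: the request controls scheduling (one pod per node), while the absence of limits lets that pod use whatever the node has left.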

Finally, configure cluster autoscaling on your GKE node pool, and you should be good to go. Vertical Pod Autoscaling doesn't really come into play here since the pod's resource request stays constant, and DaemonSets aren't applicable because, as mentioned, they can't be scaled by an HPA.
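Enabling the cluster autoscaler on an existing node pool can be done with gcloud; the cluster, pool, and zone names below are placeholders:

```shell
# Enable node autoscaling on an existing GKE node pool.
gcloud container clusters update my-cluster \
  --node-pool my-pool \
  --enable-autoscaling \
  --min-nodes 1 \
  --max-nodes 10 \
  --zone us-central1-a
```

One subtlety worth keeping in mind: the GKE cluster autoscaler adds nodes when pods are unschedulable, not directly on CPU load. So the chain here is: high CPU → HPA adds a replica → that replica can't fit on any existing node (because each pod requests most of a node) → the autoscaler provisions a new node for it.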

