How to auto scale on Kubernetes (GKE) with a pod that runs one per node and uses all available resources?

jacob

I think I have a pretty simple scenario: I need to auto-scale on Google Kubernetes Engine with a pod that runs one per node and uses all available remaining resources on the node.

"Remaining" resources means that there are certain basic pod services running on each node such logging and metrics, which need their requested resources. But everything left should go to this particular pod, which is in fact the main web service for my cluster.

Also, these remaining resources should be available when the pod's container starts up, rather than being granted later via vertical autoscaling with pod restarts. The reason is that the container has certain constraints that make restarts expensive: heavy disk caching, and licensing issues with some third-party software I use. So although the container/pod is certainly restartable, I'd like to avoid restarts except for rolling updates.

The cluster should scale nodes when CPU utilization gets too high (say, 70%). And I don't mean requested CPU utilization of a node's pods, but rather the actual utilization, which is mainly determined by the web service's load.

How should I configure the cluster for this scenario? I've seen there's cluster autoscaling, vertical pod autoscaling, and horizontal pod autoscaling. There's also Deployment vs. DaemonSet, although DaemonSets don't seem to be designed for pods that need to scale. So I think a Deployment may be necessary, but in a way that limits it to one web service pod per node (pod anti-affinity??).

How do I put all this together?

Aleksi

You could set up a Deployment with a resource request that equals a single node's allocatable resources (i.e., total resources minus what the auxiliary services request, as you mentioned). Then configure a Horizontal Pod Autoscaler to scale the Deployment up when CPU utilization goes above 70%. Since the request covers essentially the whole node, utilization relative to the request is effectively the same as total node utilization, so this should do the trick. If you do want to base scaling on actual node CPU utilization instead, you can always scale on external metrics.
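As a rough sketch, the HPA part might look like this (the names `web-service-hpa` and `web-service`, and the replica bounds, are placeholders to adapt to your setup):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-service-hpa        # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-service          # hypothetical Deployment name
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale up above 70% of the CPU *request*
```

Note that `Utilization` here is measured against the pod's CPU request, which is why sizing the request to the node matters.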

Technically the Deployment's resource request doesn't have to exactly equal the remaining resources; it's enough for the request to be large enough that two of these pods can't be scheduled on the same node (i.e., more than half the node's allocatable resources). As long as that's the case and no resource limits are set, the pod ends up free to consume all the available node resources.
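A minimal sketch of such a Deployment, assuming (hypothetically) 4-CPU / 15 GiB nodes where the auxiliary pods request a modest slice; the numbers and names are illustrative, so adjust them to your node shape:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-service            # hypothetical name
spec:
  replicas: 1                  # the HPA will adjust this
  selector:
    matchLabels:
      app: web-service
  template:
    metadata:
      labels:
        app: web-service
    spec:
      containers:
      - name: web
        image: example.com/web-service:1.0   # placeholder image
        resources:
          requests:
            cpu: "2500m"       # > half of a 4-CPU node, so two pods never fit on one node
            memory: "10Gi"     # likewise sized against the node's allocatable memory
          # no limits: the container may burst into everything left on the node
```

The key design choice is requests-without-limits: the request controls scheduling (one pod per node), while the absence of limits lets that pod use whatever the node has left.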

Finally, configure cluster autoscaling on your GKE node pool, and you should be good to go. Vertical Pod Autoscaling doesn't really come into play here since the pod's resource request stays constant, and DaemonSets aren't applicable because, as mentioned, they can't be scaled by an HPA.
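Enabling the cluster autoscaler on an existing node pool can be done with gcloud; the cluster, pool, and zone names below are placeholders:

```shell
# Enable node autoscaling on an existing GKE node pool.
gcloud container clusters update my-cluster \
  --node-pool my-pool \
  --enable-autoscaling \
  --min-nodes 1 \
  --max-nodes 10 \
  --zone us-central1-a
```

One subtlety worth keeping in mind: the GKE cluster autoscaler adds nodes when pods are unschedulable, not directly on CPU load. So the chain here is: high CPU → HPA adds a replica → that replica can't fit on any existing node (because each pod requests most of a node) → the autoscaler provisions a new node for it.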

