
I am running a k8s cluster on GKE.

It has 4 node pools with different configurations:

Node pool : 1 (single node, cordoned)

Running Redis & RabbitMQ

Node pool : 2 (single node, cordoned)

Running monitoring & Prometheus

Node pool : 3 (single large node)

Application pods

Node pool : 4 (single node with auto-scaling enabled)

Application pods

Currently, I am running a single replica of each service on GKE,

except for 3 replicas of the main service, which manages mostly everything.

When scaling this main service with HPA, I have sometimes seen the node crash, or the kubelet restart frequently and the Pods go into Unknown state.

How do I handle this scenario? If the node crashes, GKE takes time to auto-repair it, which causes service downtime.
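For context, the HPA on the main service looks roughly like this (the Deployment name, replica counts and CPU target below are illustrative assumptions, not taken from the real cluster):

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: main-service-hpa        # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: main-service          # hypothetical Deployment name
  minReplicas: 3
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when avg CPU usage exceeds 70% of requests
```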

Question : 2

Node pools 3-4 run the application Pods. Inside the application, there are 3-4 memory-intensive microservices, and I am thinking of using a node selector to pin them to one node,

while only a small node pool will run the main service, which has HPA, and node auto-scaling will work for that node pool.

However, I feel a node selector is not the best way to do it.

It's always best to run more than one replica of each service, but currently we are running only a single replica of each service, so please suggest with that in mind.

  • if you have a single node, you leave yourself with a single point of failure. Also keep in mind that autoscaling takes time to kick in and is based on resource requests. If your node suffers OOM because of memory-intensive workloads, you need to readjust your memory requests and limits. – Patrick W Commented Oct 10, 2020 at 0:50
  • Thank you so much for your reply and suggestion. I will definitely look into readjusting the requests and limits. Commented Oct 10, 2020 at 18:52

1 Answer

As Patrick W rightly suggested in his comment:

if you have a single node, you leave yourself with a single point of failure. Also keep in mind that autoscaling takes time to kick in and is based on resource requests. If your node suffers OOM because of memory intensive workloads, you need to readjust your memory requests and limits – Patrick W, Oct 10, 2020

you may need to redesign your infrastructure a bit so that you have more than a single node in every node pool, as well as readjust your memory requests and limits.
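Requests and limits are set per container in the Pod spec. A minimal fragment might look like this (the values below are purely illustrative and should come from actual usage measurements, e.g. from Prometheus):

```yaml
# Fragment of a container spec -- values are illustrative assumptions
resources:
  requests:
    memory: "1Gi"    # what the scheduler reserves; autoscaling decisions are based on this
    cpu: "500m"
  limits:
    memory: "2Gi"    # exceeding this gets the container OOM-killed
    cpu: "1"
```

Setting memory requests close to real usage prevents the node from being overcommitted, which is what leads to node-level OOM and kubelet instability.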

You may want to take a look at the following sections in the official kubernetes docs and Google Cloud blog:

How do I handle this scenario? If the node crashes, GKE takes time to auto-repair it, which causes service downtime.

That's why having more than just one node in a single node pool can be a much better option. It greatly reduces the likelihood that you'll end up in the situation described above. The GKE auto-repair feature needs to take its time (usually a few minutes), and if this is your only node, you cannot do much about it and need to accept possible downtime.
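Once you run more than one replica on more than one node, a PodDisruptionBudget can also help keep the service available while GKE drains and repairs a node (the names and label below are hypothetical):

```yaml
apiVersion: policy/v1beta1      # policy/v1 from Kubernetes 1.21+
kind: PodDisruptionBudget
metadata:
  name: main-service-pdb        # hypothetical name
spec:
  minAvailable: 2               # never voluntarily evict below 2 running Pods
  selector:
    matchLabels:
      app: main-service         # hypothetical app label
```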

Node pools 3-4 run the application Pods. Inside the application, there are 3-4 memory-intensive microservices, and I am thinking of using a node selector to pin them to one node,

while only a small node pool will run the main service, which has HPA, and node auto-scaling will work for that node pool.

However, I feel a node selector is not the best way to do it.

You may also take a look at node affinity and anti-affinity, as well as taints and tolerations.
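As a sketch of that approach (the pool name, labels and taint key below are assumptions), the memory-intensive services could be steered to a dedicated pool with node affinity on GKE's built-in node-pool label, spread across its nodes with pod anti-affinity, and the pool itself could be tainted so that nothing else lands there:

```yaml
# Pod spec fragment -- pool name, app label and taint key are hypothetical
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: cloud.google.com/gke-nodepool   # label GKE sets on every node
              operator: In
              values:
                - memory-intensive-pool            # hypothetical pool name
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: heavy-service                   # hypothetical app label
          topologyKey: kubernetes.io/hostname      # prefer different nodes
tolerations:
  - key: "workload"                                # hypothetical taint key
    operator: "Equal"
    value: "memory-intensive"
    effect: "NoSchedule"
```

Unlike a plain node selector, this keeps the scheduling rules expressive (required vs. preferred) and, with the taint, also keeps other workloads off the dedicated pool.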


3 Comments

I have a question: if I have a StatefulSet running as a single Pod, what happens if that node goes down?
If you have an additional node, the StatefulSet will try to deploy that Pod on the second node. A StatefulSet manages your Pods the same way a Deployment does. Even if you delete such a Pod yourself, this higher-level abstraction will ensure that the Pod is up and running, provided there is a node on which it can be scheduled.
Great, got it, thanks a lot for the answer; now I am much clearer on that part.
