In the current deployment, a few pods are running and waiting for gRPC calls from various clients. Linkerd is used as a service mesh to load-balance the gRPC requests coming from the clients, and it forwards new requests to idle pods, which has worked great so far.
The problem is that when the number of requests exceeds the number of pods in the deployment, for some reason the new requests get routed to the already busy pods, and all the pods end up processing the last request.
So my question is: how can I scale up the number of pods based on new incoming gRPC requests? For example, 2 pods are currently processing 2 unique requests and a 3rd request has just arrived, so I would like to spawn a new pod to process it. How is that possible?
@edpell There’s a good example of autoscaling on latency here, and it should be possible to modify the autoscaling rules with a custom query that gets the request rate from Linkerd’s metrics.
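As a rough sketch of what the Kubernetes side of such a rule could look like, here's a hedged example of an HPA targeting a per-pod request-rate metric exposed through the Prometheus Adapter. The metric name `requests_per_second`, the deployment name `web`, and the target value are all illustrative assumptions, not something Linkerd provides out of the box:

```yaml
# Sketch only: assumes a Prometheus Adapter is installed and configured to
# serve a custom per-pod metric named "requests_per_second" derived from
# Linkerd's proxy metrics. Names and values below are placeholders.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: requests_per_second   # custom metric served by the adapter
        target:
          type: AverageValue
          averageValue: "1"           # aim for roughly one request per pod
```

With a target like this, the HPA would add replicas whenever the average request rate per pod rises above the target, which approximates "spawn a new pod when an extra request arrives."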
If you run the `linkerd viz stat` command, you get output similar to the following, which includes the request rate for traffic:
NAME MESHED SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 TCP_CONN
web 1/1 91.55% 2.4rps 4ms 9ms 10ms 4
That value is derived from the success and failure response counts over a given time window.
So, you could write a Prometheus query using the `response_total` metric and the `classification` label to get the RPS for your custom rule. All of the available Prometheus metrics can be viewed for a given pod using the `linkerd dg proxy-metrics <podname>` command.
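For instance, a query along these lines could feed the custom rule. This is just a sketch: the `deployment` label, the `web` value, and the one-minute rate window are assumptions you'd adapt to your own workload and label set:

```promql
# Success-classified request rate for the "web" deployment (illustrative labels)
sum(rate(response_total{classification="success", deployment="web"}[1m]))
```

Dropping the `classification` matcher would give you the total request rate, successes and failures combined.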
Hope this gets you pointed in the right direction!