In the current deployment, a few pods are running and waiting for gRPC calls from various clients. Linkerd is used as a service mesh to load-balance the gRPC requests coming from the clients; it forwards each new request to an unused pod, which has worked great so far.
The problem arises when the number of requests exceeds the number of pods in the deployment: for some reason, the requests get mixed up among the active pods, and all the pods end up processing the last request.
So my question is: how can I scale up the number of pods based on new incoming gRPC requests? For example, if 2 pods are currently processing 2 unique requests and a 3rd request has just arrived, I would like to spawn a new pod to process it. Is that possible, and if so, how?
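
To make the behaviour I am after more concrete, here is a rough sketch of the scaling rule in Python. It only illustrates the arithmetic (one pod per in-flight request); the function names `desired_replicas` and `pods_to_add` and the `min_replicas`/`max_replicas` limits are hypothetical and not part of Linkerd or Kubernetes, and the sketch does not talk to either.

```python
def desired_replicas(in_flight_requests: int,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Return one pod per in-flight request, clamped to the given limits.

    The limits are assumptions made for this illustration.
    """
    if in_flight_requests < 0:
        raise ValueError("in_flight_requests must be non-negative")
    return max(min_replicas, min(in_flight_requests, max_replicas))


def pods_to_add(in_flight_requests: int,
                current_replicas: int,
                min_replicas: int = 1,
                max_replicas: int = 10) -> int:
    """Return how many new pods would be needed; never negative."""
    target = desired_replicas(in_flight_requests, min_replicas, max_replicas)
    return max(0, target - current_replicas)
```

With 2 pods busy and a 3rd request arriving, `pods_to_add(3, 2)` gives the one extra pod I would like to see spawned.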