"per service" circuit breaking (ELB based) external configuration


What am I trying to do?

  • My goal is to apply a successRate failure accrual policy to endpoint-1 and a consecutiveFailure (or none) policy to endpoint-2





set -m # Enable job control

for i in $(seq 50); do # Start 50 requests in parallel
  http_proxy=$(kubectl get svc l5d -o jsonpath="{.status.loadBalancer.ingress[0].*}"):4140 curl http://s3-ec2-xxxxxxx.us-east-1.elb.amazonaws.com &
done
wait # Block until all background requests have finished

The endpoint status remains unaltered even after many requests, so the failure accrual settings do not appear to be in effect.

Can you help me with what might be wrong with my configuration?


Hi @zshaik! The first thing I want to note is that circuit breaking is designed to help improve success rate by preferring healthy endpoints over unhealthy ones. In the case where there is only one endpoint (the ELB in this case) circuit breaking isn’t able to help because there are no other endpoints to choose from. In particular, circuit breaking does not guarantee that traffic will be shut off to unhealthy endpoints. I wrote about this in a bit more detail here: Unable to prevent requests reaching the endpoint(internal) after circuit breaking is in action

With that said, there are a few things we can check in your configuration to make sure circuit breaking is configured and working correctly.

The first thing to take a look at is whether the circuit breaker has activated. We can see this in the admin dashboard:

Endpoints available: 0/1 means that the 1 endpoint has been marked unhealthy and is not “available”. Note that even though this endpoint is marked as “unavailable”, linkerd will still send traffic to it because there are no other endpoints to choose from.
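
If the dashboard isn't handy, roughly the same signal can be pulled from the command line. This is only a sketch: it assumes the linkerd admin interface is reachable on localhost:9990 (adjust for your deployment) and that the failure accrual counters appear in /admin/metrics.json with failure_accrual in their names. The function name is made up for illustration.

```shell
# Hypothetical helper: read a metrics JSON dump on stdin and keep only the
# entries that mention failure accrual. Splitting on commas first means it
# works even if the JSON arrives as a single line.
failure_accrual_metrics() {
  tr ',' '\n' | grep 'failure_accrual' || true
}

# Assumed usage (host and port are assumptions, not taken from this thread):
# curl -s http://localhost:9990/admin/metrics.json | failure_accrual_metrics
```

If nothing is printed at all, the client probably has no failure accrual counters registered yet, which is worth knowing before digging into the config.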

The next thing we can look at is whether the client has been configured with failure accrual properly. Most of the time this is obvious from the configs, but with per-client configuration it can be a bit tricky. Linkerd actually exposes very detailed information about how each individual client is configured through its client registry, which you can browse at /admin/registry.json.

In the above screenshot we can see that the /$/io.buoyant.rinet/8888/localhost client is configured with a SuccessRate failure accrual policy.
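
The registry can also be searched from a terminal rather than browsed. The snippet below is a rough sketch, assuming the admin interface is on localhost:9990 and that the failure accrual entries in the registry contain the text "failureaccrual" in some casing; the exact key names and the function name are assumptions on my part.

```shell
# Hypothetical helper: read registry JSON on stdin and print the entries
# whose text mentions failure accrual (case-insensitive match).
registry_failure_accrual() {
  tr ',' '\n' | grep -i 'failureaccrual' || true
}

# Assumed usage (host and port are assumptions, not taken from this thread):
# curl -s http://localhost:9990/admin/registry.json | registry_failure_accrual
```

Each client should show up with its own entry, so it is easy to compare what endpoint-1 and endpoint-2 actually ended up with.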

I hope this helps!


Thanks Alex! @Alex :racehorse::thumbsup: