Linkerd proxy failing to forward outgoing traffic to a headless service

Here are the proxy logs:

[ 3200.246474462s] INFO outbound:accept{peer.addr=10.240.5.148:48840}:source{target.addr=10.240.5.95:8001}:logical{addr=10-240-5-95.my-service.my-namespace.svc.cluster.local:8001}:profile:balance{addr=10-240-5-95.my-service.my-namespace.svc.cluster.local:8001}: linkerd2_proxy_api_resolve::resolve: No endpoints
[ 3203.241936032s] WARN outbound:accept{peer.addr=10.240.5.148:48840}:source{target.addr=10.240.5.95:8001}: linkerd2_app_core::errors: Failed to proxy request: request timed out
[ 3203.298673231s] WARN outbound:accept{peer.addr=10.240.5.148:48840}:source{target.addr=10.240.5.95:8001}: linkerd2_app_core::errors: Failed to proxy request: Service in fail-fast
[ 3203.402355728s] WARN outbound:accept{peer.addr=10.240.5.148:48840}:source{target.addr=10.240.5.95:8001}: linkerd2_app_core::errors: Failed to proxy request: Service in fail-fast
[ 3203.605594976s] WARN outbound:accept{peer.addr=10.240.5.148:48840}:source{target.addr=10.240.5.95:8001}: linkerd2_app_core::errors: Failed to proxy request: Service in fail-fast
[ 3203.910474000s] WARN outbound:accept{peer.addr=10.240.5.148:48840}:source{target.addr=10.240.5.95:8001}: linkerd2_app_core::errors: Failed to proxy request: Service in fail-fast
[ 3204.312820653s] WARN outbound:accept{peer.addr=10.240.5.148:48840}:source{target.addr=10.240.5.95:8001}: linkerd2_app_core::errors: Failed to proxy request: Service in fail-fast

Validated the following:

  • ‘linkerd check’ passes cleanly (commands shown below the list)
  • All Linkerd pods are running fine with no errors
  • My application pod is running fine
  • No issues with ‘kubectl api-resources’
  • K8s version 1.14 and Linkerd version 2.7.1
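
For reference, these were just the standard checks, run roughly as follows (the namespace is the placeholder name from the logs above):

linkerd check
kubectl get pods -n linkerd
kubectl get pods -n my-namespace
kubectl api-resources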

This is a k8s cluster running on Azure Kubernetes Service. I'd appreciate any further pointers for debugging this.

This is my setup without linkerd:

gRPC client -> Custom Loadbalancer client sidecar ----> Custom Loadbalancer server sidecar-> backend gRPC pods

My setup with linkerd:

gRPC client -> Custom Loadbalancer client sidecar -> Linkerd proxy ----> Linkerd proxy -> Custom Loadbalancer server sidecar -> backend gRPC pods

The logs above are from the Linkerd proxy running with the gRPC client. If I take this proxy out, my connections seem fine.

@vijaygos can you describe the custom LoadBalancer server sidecar? Does it run in the same pod as Linkerd? Or have you only added the Linkerd proxy to the ingress controller?

Which ingress controller are you using?

I’m curious to know what it does and how it uses iptables and networking. Is it possible to try without the sidecar? Linkerd is going to do load balancing as well, so I suspect that there is some collision or contention between the custom LoadBalancer server sidecar and Linkerd.

Hi @cpretzer,
Forgive my ineptness at explaining myself correctly. The ingress controller was a “bad copy-paste” error, which I have now fixed in the scenario above. The Linkerd proxies are meant to talk to one another directly; there is absolutely NO ingress controller involved.
The Linkerd proxy and the load balancer run in the same pod. The load balancer exists to support some long-running scenarios for us: it is designed to talk directly to the headless service’s pods using the pod hostnames. In other words, there is custom logic baked into the load balancer to pick the “right” pod.
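
To illustrate, the service is an ordinary headless Service, roughly like this (names and labels here are placeholders matching the addresses in the logs above):

apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: my-namespace
spec:
  clusterIP: None          # headless: DNS resolves to the individual backend pods
  selector:
    app: my-backend        # placeholder label for the backend gRPC pods
  ports:
  - name: grpc
    port: 8001
    targetPort: 8001

The load balancer then dials individual pods as <pod>.my-service.my-namespace.svc.cluster.local:8001, which is the logical address visible in the proxy logs.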

@vijaygos Thanks for clarifying, no forgiveness is necessary :slight_smile:

One thing that might help is to set the proxy log level to TRACE so that we can see some details about the failing requests.
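
Depending on your Linkerd version, you can do this either by setting the config.linkerd.io/proxy-log-level annotation on the pod template (if your version supports it) or by editing the LINKERD2_PROXY_LOG environment variable on the linkerd-proxy container. The deployment name below is a placeholder for your gRPC client workload:

# set the proxy log level via annotation and let the pods roll (very verbose; revert when done)
kubectl -n my-namespace patch deployment <grpc-client-deployment> --type merge \
  -p '{"spec":{"template":{"metadata":{"annotations":{"config.linkerd.io/proxy-log-level":"trace"}}}}}'

# alternatively, edit the LINKERD2_PROXY_LOG env var on the linkerd-proxy container directly
kubectl -n my-namespace edit deployment <grpc-client-deployment>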

Does the custom load balancer emit any logs that might help us understand the route of one of these failing requests? I’m pretty sure this is a conflict between the load balancing that Linkerd does and the logic in the custom load balancer.
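
If it does, pulling the sidecar’s logs next to the proxy’s logs for the same time window should let us trace one failing request end to end. The pod and sidecar container names below are placeholders for your pod spec; linkerd-proxy is the injected proxy container:

# Linkerd proxy logs on the gRPC client pod
kubectl -n my-namespace logs <grpc-client-pod> -c linkerd-proxy

# custom load balancer client sidecar logs (use your actual container name)
kubectl -n my-namespace logs <grpc-client-pod> -c <custom-lb-client-sidecar>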