Hi, I’ve been researching how Linkerd (and service meshes in general) works with k8s headless services, and I have some questions. Apologies in advance if I miss any design constraints; I’m fairly new to the whole k8s + service mesh space.
The first one is about multi-cluster. According to this blog post, multi-cluster-kubernetes-with-headless-services, when routing to a headless service in a remote cluster (say R), traffic first has to go through a dedicated gateway in R. This is because the mirrored Endpoints use the gateway’s IP as the destination. Is this a fundamental requirement? Afaik, in certain k8s setups like GKE, pod IPs are first-class and natively routable from anywhere within the VPC. So it seems that having each mirrored Endpoint point directly to its corresponding remote pod should work, in which case we save an extra hop. Is this design possible (assuming that kind of network setup)? Or would we lose some features that depend on going through the gateway?
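To make the two options concrete, here is a rough sketch of what I mean. It is only my simplified reading of the blog post, not the actual output of the mirroring controller; `RemotePod`, `mirrored_addresses`, and the `flat_network` flag are hypothetical names I made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RemotePod:
    """A pod backing the headless service in the remote cluster R (hypothetical type)."""
    hostname: str
    ip: str


def mirrored_addresses(
    pods: List[RemotePod], gateway_ip: str, flat_network: bool
) -> List[Dict[str, str]]:
    """Build the address list for the mirrored Endpoints in the local cluster.

    flat_network=False: every address points at R's gateway (the behavior
    the blog post describes, as I understand it).
    flat_network=True: every address points straight at the remote pod IP,
    which assumes pod IPs are routable across clusters (e.g. same VPC).
    """
    return [
        {"hostname": pod.hostname, "ip": pod.ip if flat_network else gateway_ip}
        for pod in pods
    ]
```

With `flat_network=False` all addresses collapse to the one gateway IP, while with `flat_network=True` each hostname keeps its own pod IP, which is the variant I’m asking about.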
Second question is also about headless services, but within a single cluster. The linkerd doc on load balancing states this:
> If working with headless services, endpoints of the service cannot be retrieved. Therefore, Linkerd will not perform load balancing and instead route only to the target IP address.
Istio has the same constraint afaik. Again, is this a fundamental limitation of service meshes on k8s? Assuming the sidecar proxy on the client pod can use DNS for service discovery, it seems that it should be able to retrieve the IPs of all the pods backing the headless service and then do client-side load balancing across those IPs. I saw this approach mentioned in several blog posts on how to load balance gRPC services in k8s in a non-service-mesh world. In fact, from my understanding, Linkerd already does client-side load balancing for gRPC; it just doesn’t support headless services.
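Roughly, the DNS-based approach I have in mind looks like the sketch below. This is not how Linkerd works internally, just an illustration of the idea; `resolve_headless` and `round_robin` are names I invented, and a real proxy would need to re-resolve periodically since pod IPs change.

```python
import itertools
import socket
from typing import Callable, Iterator, List


def resolve_headless(host: str, port: int) -> List[str]:
    """Return the distinct IPs behind a DNS name.

    For a headless service, cluster DNS is expected to return one
    record per ready pod rather than a single virtual IP.
    """
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address.
    return sorted({info[4][0] for info in infos})


def round_robin(
    host: str,
    port: int,
    resolver: Callable[[str, int], List[str]] = resolve_headless,
) -> Iterator[str]:
    """Cycle over the resolved pod IPs, one per outgoing request.

    The resolver is injectable so the sketch can be tried without a cluster.
    Note this takes a single DNS snapshot; it does not track pod churn.
    """
    ips = resolver(host, port)
    if not ips:
        raise LookupError(f"no addresses found for {host}")
    return itertools.cycle(ips)
```

A client would call `next(picker)` on the returned iterator before each request to choose the next pod IP.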
Or, if this is not possible (maybe the DNS step doesn’t work the way I think), can we do something similar to what the Endpoint-mirroring controller for multi-cluster does? If it can discover and mirror all headless service Endpoints from a remote cluster, can’t the same logic be used for a local service? When the doc says
> endpoints of the [headless] service cannot be retrieved
what exact part of k8s makes it so?
Or, if the problem is that the proxy cannot tell whether the client pod wants to load balance across the target pod IPs or to route directly to one specific IP it picked itself, is it possible to add a mechanism that lets the client announce its “intent” to Linkerd?