Single linkerd mesh inside & outside Kubernetes

How reasonable is it to deploy linkerd as a connectivity layer between a set of servers that are deployed on plain old VMs and a set of services running in Kubernetes?

Context: We have an application stack that was originally developed outside of Kubernetes. We would like to be 100% inside Kubernetes eventually. We are migrating gradually, because we would prefer not to execute a single big-bang switchover. Migrating each service currently involves rather laboriously figuring out how to set up the legacy servers to talk to the Kubernetes-ized service or vice versa (keeping in mind that we do not want most of these to be routable from the public Internet), then delicately executing the switchover. Then you move on to the next service and do it all over.

I understand that linkerd can run both inside Kubernetes and outside of it. Naively, it seems like one way to make our jobs easier would be

  1. run linkerd daemonsets inside Kubernetes,
  2. run linkerd instances on our plain old VMs,
  3. get all linkerd instances to talk to each other, and
  4. update our services to talk to each other through linkerd,

at which point migrating any given service to Kubernetes merely reduces to the problem of deploying it in Kubernetes. Of course steps 1-3 might not be straightforward, but at least they would be a one-time cost, after which step 4 presumably becomes simpler per-service.

Is this a sensible idea? How difficult would this be?

100% possible. This is one of our biggest use cases. Linkerd can be configured to talk to the K8s API, as well as to any service discovery mechanisms on the “legacy” side (including DNS and static host lists), in whichever precedence order makes sense. Then service connectivity is completely decoupled from service deployment.

What service discovery mechanism, if any, are you using outside of Kubernetes?

Our “service discovery” outside Kubernetes is somewhat ad hoc, but can be summarized as a mixture of

(a) DNS names and
(b) IP addresses that we dump into etcd in a dumb proprietary JSON format.

The stuff that does (b) is pretty simple and it would be straightforward to tell it to write those addresses somewhere else.
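For example, the bridge from (b) to the fs namer mentioned below could be a small script that rewrites the etcd JSON as one file per service. This is only a sketch: the JSON shape (`{service: ["host:port", ...]}`) is a stand-in for your proprietary format, and the fs namer's expected file format (newline-delimited `host port` entries, one file per service) should be double-checked against the namer docs.

```python
import json
import os
import tempfile

def write_disco_files(services, root_dir):
    """Write each service's addresses as a file the fs namer can read.

    `services` is assumed to look like {"name": ["host:port", ...]};
    each output file contains one "host port" line per address.
    """
    for name, addrs in services.items():
        with open(os.path.join(root_dir, name), "w") as f:
            for addr in addrs:
                host, port = addr.rsplit(":", 1)
                f.write("%s %s\n" % (host, port))

# Stand-in for the JSON currently dumped into etcd:
raw = json.loads('{"billing": ["10.0.0.5:8080", "10.0.0.6:8080"]}')

# In production this would be the directory linkerd's fs namer watches
# (e.g. its configured rootDir); a temp dir here for illustration.
disco = tempfile.mkdtemp()
write_disco_files(raw, disco)
```

Run this whenever the etcd data changes (or from whatever currently writes to etcd) and the fs namer picks up the new addresses.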

Ok. For K8s you can use the K8s namer, for (a) you can use the DNS namer, and for (b) you can either write a custom etcd namer plugin (hard), or you can write it to files on disk and use the “fs” namer (easy). If this data is not moving rapidly, I would probably recommend the fs namer approach. (There are some caveats to production use, but for slow-moving data it should be fine.)
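A minimal sketch of what wiring up those namers might look like in a linkerd (1.x) config. The hostnames, ports, and directory path are assumptions; check the namer documentation for the full set of options.

```yaml
namers:
# K8s namer: resolves services via the Kubernetes API
# (here assumed to be reachable through a local kubectl proxy).
- kind: io.l5d.k8s
  host: localhost
  port: 8001
# fs namer: resolves services from "host port" files in a directory.
- kind: io.l5d.fs
  rootDir: /etc/linkerd/disco
```

Plain DNS lookups don't need a namer block; they can be expressed directly in the dtab (see below).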

Using those three namers, you can configure Linkerd’s routing policy (“dtab rules”) to look up a service name in K8s first, then in DNS, and finally in the file. (Or reverse the order of the last two if you want.) There are some details around how this will work depending on what your DNS looks like, but that’s the basic idea.
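As a sketch, the router section of the config could express that precedence like this. Dtab entries are tried bottom-up, so the K8s binding is attempted first and the fs namer is the last resort. The `default` namespace, the `http` port name, the example DNS name, and the port numbers are all assumptions.

```yaml
routers:
- protocol: http
  dtab: |
    /svc         => /#/io.l5d.fs ;
    /svc/billing => /$/inet/billing.internal.example.com/8080 ;
    /svc         => /#/io.l5d.k8s/default/http ;
  servers:
  - port: 4140
```

Note the DNS fallback here uses the built-in `/$/inet` binding, which takes an explicit host and port, so you'd write one such rule per legacy DNS name rather than a single blanket rule.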

You can run Linkerd in Kubernetes as a daemonset, and deploy it outside of Kubernetes however you like (ideally one per host, but other options are possible).

Finally, you’ll also need to set up your layer 3 network so that communication is possible. The simplest approach is to make pods addressable from outside the cluster, i.e. a Linkerd instance outside the K8s cluster can proxy directly to the pod IP address returned by the K8s API. Other options are also possible if that’s not feasible.

Once all that is in place, you should be set up so that services are able to send requests to e.g. http://foo without needing to know anything about where the foo service lives. And when you migrate foo onto Kubernetes (or off of it!) you shouldn’t have to change anything further.
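From the application's side, "sending requests to http://foo" just means proxying plain HTTP through the local linkerd instance. A minimal Python sketch, assuming linkerd is listening on its default HTTP port 4140 on each host:

```python
import urllib.request

# Local linkerd instance acting as an HTTP proxy (port is an
# assumption; 4140 is linkerd's default HTTP port).
LINKERD_PROXY = {"http": "http://localhost:4140"}

def service_url(service, path):
    # The URL names the *logical* service ("foo"), not a real DNS
    # name; linkerd resolves it via its dtab (K8s, DNS, or the fs
    # namer), so callers never embed concrete addresses.
    return "http://%s%s" % (service, path)

def call_service(service, path):
    # Route the request through the local linkerd proxy. This call
    # looks identical before and after the service migrates into
    # (or out of) Kubernetes.
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler(LINKERD_PROXY))
    return opener.open(service_url(service, path))
```

So migrating `foo` changes where linkerd resolves it, not how the caller invokes it.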

Hope that makes sense. That’s the basic outline. We can help you with details as you dig further into this.

Is it still possible to connect VMs and Kubernetes this way? Can you please explain what to do? I don’t quite understand it.