How many services can linkerd realistically handle?

If I organized my application into, say, hundreds of thousands or millions of services, could linkerd keep up?

I’m picturing a database service with thousands of users, each of whom has hundreds of database shards. I want to treat each shard as a separate service and use linkerd + namerd to keep track of where all the shards are and route the traffic. Is that realistic? Is it a good idea?

Hiya!

Performance depends heavily on linkerd’s configuration and deployment (sidecar or per-host deployment, JVM memory and runtime params, the number of client connections linkerd keeps open, the percentage of traffic being traced, and plenty of other things I’m not thinking of at the moment), so it’s tough to say without a close look at the architecture you’re planning. With the right setup, a linkerd deployment can easily handle millions of service instances.

I don’t hear of a lot of people using linkerd specifically for database access, though, mostly because it doesn’t yet support stateful protocols like MySQL.


Nice, thanks for the reply.

I guess I’m thinking more generally about sharding, and how much it has in common with what linkerd does. If I’ve decided to split someone’s data into ten shards, for example, and I want to have three copies of each for redundancy, then I’d like to have service discovery, load balancing, and fault tolerance for each shard, not to mention logging, telemetry, and all the other goodies you get from linkerd.

Maybe what I need to make is a custom namer plug-in that supplies the logic to keep track of the shards? I’d love to find some way to get linkerd into the picture here.

@prdoyle if you can express the shard id as part of the Host header value on each request, linkerd will effectively treat each shard as a separate service, and you’ll get that behavior per shard. If the shard id needs to be extracted from the payload itself, a custom namer might work, but I’m a bit unsure; we typically don’t inspect the payload for routing decisions, only the metadata.
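
For illustration, here is a rough client-side sketch of the Host-header idea in Python. The naming scheme (`shard-<n>.<user>.db`), the `/svc` prefix, and the proxy address `http://localhost:4140` are all assumptions made up for this example, not something confirmed in this thread; how linkerd actually turns a Host value into a service name depends on the identifier and dtab in your config.

```python
import urllib.request

# Hypothetical naming scheme: one logical host name per (user, shard) pair.
def shard_host(user: str, shard: int) -> str:
    return f"shard-{shard}.{user}.db"

# Assumption: the router names each request by prefixing the Host value.
# The "/svc" prefix is illustrative; check your own identifier settings.
def logical_name(host: str, prefix: str = "/svc") -> str:
    return f"{prefix}/{host}"

# Build a request that goes to the local proxy (assumed address) but
# carries the shard's logical name in the Host header.
def build_request(user: str, shard: int, path: str,
                  proxy: str = "http://localhost:4140") -> urllib.request.Request:
    req = urllib.request.Request(proxy + path)
    req.add_header("Host", shard_host(user, shard))
    return req

req = build_request("alice", 7, "/query")
print(req.full_url, req.get_header("Host"))
print(logical_name(shard_host("alice", 7)))
```

Since every shard gets its own Host value, each one shows up as its own logical service, so three replicas of shard 7 would be three instances behind a single name.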

How much control do you have over this system? Is the first approach workable?

@william - Yeah, I don’t think we need to go into the payload; just the headers. Thanks!

Great. Try splitting the shards by Host header (or there are other variations of this idea if you want to use part of the URL path) and I think it should Just Work ™. Let us know how it goes.
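
If you go with the URL path variation instead, the idea is roughly the following (a Python sketch of the concept only, not linkerd’s actual code). The path layout `/<user>/<shard>/...`, the number of segments used, and the `/svc` prefix are assumptions for the example.

```python
from urllib.parse import urlsplit

# Derive a per-shard service name from the first few path segments.
# Assumed layout: /<user>/<shard>/<rest of the request path>
def path_service_name(url: str, segments: int = 2, prefix: str = "/svc") -> str:
    parts = [p for p in urlsplit(url).path.split("/") if p]
    if len(parts) < segments:
        raise ValueError(f"need at least {segments} path segments, got {len(parts)}")
    return prefix + "/" + "/".join(parts[:segments])

print(path_service_name("http://db.example/alice/shard-7/tables/orders?limit=10"))
```

Either way, the shard id lives in request metadata rather than the payload, which is what routing needs.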
