Debugging Linkerd to Consul and Service Connections


#1

I’m trying to do a PoC for a dev environment where I have Consul and Linkerd running as containers on a Windows host and then starting a .NET Core web service. The goal is to have the service self-register with Consul and then be able to use Linkerd as the endpoint to talk to the service.

I’ve created my own network in Docker and placed both containers within it. Running a shell from the linkerd container I can run

ping consul
or
curl http://consul/v1/status/leader

and get responses. This suggests that the linkerd container can reach the consul service. I can also ping my service’s host and access its endpoint:

ping 192.168.1.xxx
or
curl http://192.168.1.xxx:5000/api/values

So it would also appear that linkerd can talk to the service.

I’ve taken the yaml file from the docker example and modified it to fit my needs:

admin:
  ip: 0.0.0.0
  port: 9990

namers:
- kind: io.l5d.consul
  includeTag: false
  useHealthCheck: false
  host: consul
#  host: 172.19.0.3

routers:
- protocol: http
  label: /http-consul
  service:
    responseClassifier:
      kind: io.l5d.http.retryableIdempotent5XX
  identifier:
    kind: io.l5d.path
    segments: 1
    consume: true
  dtab: |
    /svc => /#/io.l5d.consul/dc1;
  servers:
  - port: 4140
    ip: 0.0.0.0
  client:
    requeueBudget:
      percentCanRetry: 5.0

usage:
  orgId: linkerd-examples-consul

As you can see, I’ve tried using both the host name of the consul server and its IP address.

So, I start my service and it self-registers in Consul. I can see this in the UI and by querying consul’s REST API:

[
    {
        "ID": "acda76f5-cb84-7267-5fca-44fbed0948fd",
        "Node": "consul.test.com",
        "Address": "127.0.0.1",
        "Datacenter": "dc1",
        "TaggedAddresses": {
            "lan": "127.0.0.1",
            "wan": "127.0.0.1"
        },
        "NodeMeta": {
            "consul-network-segment": ""
        },
        "ServiceKind": "",
        "ServiceID": "background-forms-api-v1-final-01",
        "ServiceName": "background-forms-api",
        "ServiceTags": [
            "Background Check",
            "CBES",
            "Forms"
        ],
        "ServiceAddress": "http://192.168.1.203",
        "ServiceWeights": {
            "Passing": 1,
            "Warning": 1
        },
        "ServiceMeta": {},
        "ServicePort": 5000,
        "ServiceEnableTagOverride": false,
        "ServiceProxyDestination": "",
        "ServiceConnect": {
            "Native": false,
            "Proxy": null
        },
        "CreateIndex": 2718,
        "ModifyIndex": 2744
    }
]

When I run curl http://localhost:4140/background-forms-api/api/values I get:

Unable to route request!

service name: /svc/background-forms-api
resolutions considered:
  /#/io.l5d.consul/dc1/background-forms-api (neg)
dtab:

base dtab:
  /svc => /#/io.l5d.consul/dc1
override dtab:
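
For reference, the names in this output can be reproduced with a simplified model, assuming the io.l5d.path identifier with segments: 1 and the single dtab rule from the config above. The helper names identify and delegate are hypothetical, and real dtab delegation is richer than this (alternates, wildcards, repeated rewriting), so this is only a sketch:

```python
def identify(path, segments=1, prefix="/svc"):
    """Build a service name from the first `segments` parts of the URL path."""
    parts = [p for p in path.split("/") if p]
    if len(parts) < segments:
        raise ValueError("not enough path segments to identify a service")
    return prefix + "/" + "/".join(parts[:segments])


def delegate(name, dtab):
    """Apply the last matching prefix rule once (simplified dtab lookup)."""
    for src, dst in reversed(dtab):
        if name == src or name.startswith(src + "/"):
            return dst + name[len(src):]
    return name


# The one rule from the config in this post.
DTAB = [("/svc", "/#/io.l5d.consul/dc1")]

service_name = identify("/background-forms-api/api/values")
resolution = delegate(service_name, DTAB)
print(service_name)
print(resolution)
```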

I have the logs set to debug, but I won’t post the whole 500+ lines of startup. The only consul-related line outside of startup is:

2018-10-02T13:24:59.869378900Z D 1002 13:24:59.868 UTC THREAD28 TraceId:399727fb73e5d28a: consul lookup: dc1 /#/io.l5d.consul/dc1/background-forms-api

Nothing else.

Where I’m struggling is working out what linkerd is doing and where it is failing. Is it unable to query consul for the service status? Is it unable to reach the client? I don’t have a clear understanding of what “Unable to route request” means. Does it mean that linkerd couldn’t generate a route for the request, or that the network route is down?

Thanks


#2

I think I may have found the issue.

When the service auto-registered, it was sending its scheme and host as the ServiceAddress, so the value ended up as http://192.168.1.203 rather than a bare host or IP. The Consul UI made this look correct, because it shows the full base URL with the port attached: http://192.168.1.203:5000.

I found this by looking at the service registration for the HelloWorld service in the docker example. The example uses an older version of consul, so I’m now going to test the change to my service registration code to make sure it still works.
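
A minimal sketch of the kind of change involved, written in Python rather than my actual .NET registration code. It assumes the service knows its own base URL and that registration goes through Consul’s agent endpoint PUT /v1/agent/service/register; the function name registration_payload is hypothetical:

```python
import json
from urllib.parse import urlsplit


def registration_payload(service_id, name, base_url, tags=()):
    """Split a base URL so Address holds only the host and Port only the port."""
    parts = urlsplit(base_url)
    if not parts.hostname or parts.port is None:
        raise ValueError("base_url must include a host and an explicit port")
    return {
        "ID": service_id,
        "Name": name,
        "Tags": list(tags),
        "Address": parts.hostname,  # no scheme, no port
        "Port": parts.port,
    }


payload = registration_payload(
    "background-forms-api-v1-final-01",
    "background-forms-api",
    "http://192.168.1.203:5000",
    tags=["Background Check", "CBES", "Forms"],
)
print(json.dumps(payload, indent=2))
```

The printed JSON is what would be sent as the request body; the point is only that the scheme is stripped before the address is registered.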

I would still like to know how I could have troubleshot this better, though. I feel like I just stumbled on the answer rather than being told by linkerd that something was wrong.
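
One check that might have flagged this earlier is scanning the catalog JSON for ServiceAddress values that look like URLs. The sketch below assumes the response shape shown in my first post; the function name suspicious_addresses is hypothetical and the sample entry is trimmed to the relevant fields:

```python
def suspicious_addresses(catalog_entries):
    """Return (ServiceID, ServiceAddress) pairs whose address looks like a URL."""
    problems = []
    for entry in catalog_entries:
        address = entry.get("ServiceAddress", "")
        # A registered address should be a bare host name or IP.
        if "://" in address or "/" in address:
            problems.append((entry.get("ServiceID"), address))
    return problems


# Trimmed copy of the catalog entry from the first post.
sample = [
    {
        "ServiceID": "background-forms-api-v1-final-01",
        "ServiceName": "background-forms-api",
        "ServiceAddress": "http://192.168.1.203",
        "ServicePort": 5000,
    }
]

for service_id, address in suspicious_addresses(sample):
    print(f"{service_id}: ServiceAddress {address!r} is not a bare host or IP")
```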


#3

Hi @mikejr83,

A couple of tools to aid in debugging routing issues:

  1. Linkerd’s DTAB playground, in the admin UI, provides a way to validate routing in your current config:
    https://linkerd.io/1/administration/dtab-playground/

  2. Linkerd also provides an administrative endpoint that returns the state of the io.l5d.consul namer, to validate it has correct service discovery information. That endpoint is at [linkerd]:9990/namer_state/io.l5d.consul.json, more info:
    https://api.linkerd.io/1.5.0/linkerd/index.html#consul-configuration
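
As a rough sketch, the namer state endpoint from item 2 could be fetched like this. It assumes the admin port 9990 from the config above is reachable from wherever the script runs (for example, published to localhost); the function names are hypothetical and the shape of the returned JSON is not assumed, it is simply printed:

```python
import json
from urllib.request import urlopen


def namer_state_url(admin_host="localhost", admin_port=9990, namer="io.l5d.consul"):
    """Build the admin URL for a namer's state (host and port are assumptions)."""
    return f"http://{admin_host}:{admin_port}/namer_state/{namer}.json"


def dump_namer_state(url, timeout=5.0):
    """Fetch the namer state and pretty-print whatever JSON comes back."""
    with urlopen(url, timeout=timeout) as response:
        state = json.load(response)
    print(json.dumps(state, indent=2))
    return state
```

Calling dump_namer_state(namer_state_url()) after sending a request through port 4140 should show what the namer currently knows about the service, which can then be compared with what consul reports.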

Let us know how it goes.

Cheers,
sig

