1) Check that the service you’re trying to hit is running
Always good to double check this first! Make a request to the service directly
to make sure it’s healthy.
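One way to script this check is sketched below. It assumes the service speaks HTTP; `SERVICE_URL` and the `/health` path are placeholders, not linkerd settings, so substitute whatever address your service actually listens on.

```shell
# classify_status: map an HTTP status code to a rough health verdict.
# curl reports 000 when it could not connect at all.
classify_status() {
  case "$1" in
    000) echo "unreachable" ;;
    2??) echo "healthy" ;;
    *)   echo "unhealthy" ;;
  esac
}

# check_service: fetch only the status code of a URL and classify it.
check_service() {
  classify_status "$(curl -s -o /dev/null -w '%{http_code}' "$1")"
}

# Example (SERVICE_URL and /health are hypothetical):
#   check_service "$SERVICE_URL/health"
```

Treating only 2xx as healthy is a deliberately strict choice; loosen the `case` patterns if your service answers health checks with redirects.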
2) Check if linkerd is running
Note that linkerd exits on startup if it doesn’t understand the config (or if no
config is specified). To check, ping the admin endpoint:
curl -v $LINKERD_URL:9990/admin/ping
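If you want to script the ping check, a minimal sketch follows. It assumes the admin ping endpoint answers with the body `pong`, which is the usual behavior of this kind of admin server but should be confirmed against your linkerd version. The function names are illustrative only.

```shell
# ping_ok: succeed only if the given response body is exactly "pong".
# (Assumption: the admin ping endpoint replies with "pong".)
ping_ok() {
  [ "$1" = "pong" ]
}

# check_linkerd: query the admin ping endpoint and report the result.
check_linkerd() {
  if ping_ok "$(curl -s "$1:9990/admin/ping")"; then
    echo "linkerd is running"
  else
    echo "linkerd is not responding on the admin port"
    return 1
  fi
}

# Example:
#   check_linkerd "$LINKERD_URL"
```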
3) Check the linkerd logs
In docker this would be: docker logs linkerd
If there are no error logs, you can try turning up the logging at
$LINKERD_URL:9990/logging and see if that shows additional info, or restart
linkerd with the -log.level=DEBUG flag. Common errors that appear in logs are
described under Interpreting Linkerd Logs. If there are still no error logs…
4) Check for incoming requests
Open $LINKERD_URL:9990 in a browser and confirm that linkerd is receiving the
same number of requests that are being sent.
5) Check metrics
The metrics payload includes counters for failures and exceptions, which you can
list with:
curl "$LINKERD_URL:9990/admin/metrics.json?pretty=1" | jq 'with_entries(select(.key | contains("failure"), contains("exception")))'
Metrics definitions are listed in detail as part of the Finagle guide.
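On a busy router the list of failure and exception counters can be long, and most of them are usually zero. The sketch below narrows the output to counters that have actually fired. It assumes the metrics payload is a flat JSON object mapping metric names to numbers; `nonzero_failures` is an illustrative name, not part of linkerd.

```shell
# nonzero_failures: read a flat metrics JSON object on stdin and keep only
# the failure/exception counters whose value is greater than zero.
nonzero_failures() {
  jq -c 'with_entries(select((.key | contains("failure") or contains("exception")) and .value > 0))'
}

# Example:
#   curl -s "$LINKERD_URL:9990/admin/metrics.json" | nonzero_failures
```

Running it twice a few seconds apart and comparing the values is a quick way to see which counters are still climbing.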
Interpreting Linkerd Logs
Exception thrown in main on startup
Usually there will be a Jackson error like
com.fasterxml.jackson.databind.JsonMappingException right above to clue you in
on what part of the config was invalid.
No hosts are available for {path}, Dtab.base=[{dtab}], Dtab.local=[]
This most commonly appears when using a misconstructed dtab. To diagnose
further:
- visit $LINKERD_URL:9990/delegator in the browser.
- make sure the router in question is selected from the dropdown in the upper right corner.
- make sure the dtab you expect the router to use is displayed!
- copy the path from the error message into the path text field and click Go!
You should see a tree of prefix matches, with the paths in red showing either
paths with no further prefix matches available or an exception from the matched
namer. Namer exceptions can occur when the entry does not exist in service
discovery or the namer cannot connect to service discovery.
com.twitter.finagle.ChannelClosedException
The channel was closed, for instance if the connection was reset by a peer or a
proxy. This is a relatively common exception to see in linkerd logs, and
generally means a server linkerd is talking to is suffering from some kind of
uncaught exception, panic, or error. Seeing ChannelClosedException suggests
you should check your application logs for potential problems.
com.twitter.finagle.CancelledRequestException
The request was cancelled by an upstream client. This could be due to aggressive
timeouts, or the user simply hitting Ctrl-C.
com.twitter.finagle.FailedFastException
If (after multiple tries) linkerd can’t establish a connection to a server,
linkerd temporarily removes it from the load balancing pool. Typically this
happens when the server is down or the server did not get registered
in service discovery.
Or if all that is fine, linkerd just might be too aggressive in kicking things
out of the pool. You can configure failureAccrual to be more lenient, or you can
mark certain classes of errors as retryable. See our circuit breaking blog post
for more information.
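To see at a glance which of the messages described in this section dominate a log, you can tally them. This is a rough sketch: it matches only the strings named above, and the function name is illustrative.

```shell
# tally_known_errors: read log text on stdin and count occurrences of the
# messages covered in this section, most frequent first.
tally_known_errors() {
  grep -E -o 'JsonMappingException|No hosts are available|ChannelClosedException|CancelledRequestException|FailedFastException' \
    | sort | uniq -c | sort -rn
}

# Example (container name taken from the docker command in step 3):
#   docker logs linkerd 2>&1 | tally_known_errors
```

An empty result means none of these messages appear, in which case raising the log level as described in step 3 is the next thing to try.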
Kubernetes environment gotchas
Network configurations
Running Minikube/CNI/Calico/Weave? See our Flavors of Kubernetes page on getting the default examples to work.
Failed to resolve namerd.default.svc.cluster.local
While using the namerd interpreter, linkerd was unable to reach namerd via
cluster DNS. This happens if the linkerd instance is running as a daemonset with
hostNetwork: true. In these cases you will need to use namerd’s nodePort in
the linkerd config.
io.buoyant.k8s.Api$NotFound
Linkerd was unable to reach the k8s dtab store. Often this is because the ThirdPartyResource is not set up on the cluster.
CI Failures
Linkerd’s CI Config runs on openjdk:8. To reproduce CI failures in your local environment, do:
docker run -it openjdk:8 /bin/bash
git clone https://github.com/linkerd/linkerd.git
cd linkerd
# checkout your branch, run tests