We are running a Namerd and Consul cluster in our environment. We frequently get the error below, and we have to restart the Namerd services to mitigate the issue.
Namerd-01 namerd: E 0114 06:30:29.851 UTC THREAD31: Retrying Consul request ‘GET /v1/health/service/xxxx?dc=x&passing=true’ on NonFatal error: com.twitter.finagle.ChannelWriteException: com.twitter.finagle.ChannelClosedException: null at remote address: consul.service.x.x.internal.prod/x.x.x.x:8500. Remote Info: Not Available from service: client. Remote Info: Upstream Address: Not Available, Upstream id: Not Available, Downstream Address: consul.x.x.x.internal/x.x.x.x:8500, Downstream label: client, Trace Id: 6d298c434b2d50b7.6d298c434b2d50b7<:6d298c434b2d50b7
Please help if anyone has faced this issue. We are using the configuration below.
/opt/namerd/namerd.yaml
admin:
  port: 9991
  ip: 0.0.0.0
namers:
- kind: io.l5d.consul
  host: consul.service.x.x.internal.prod
  port: 8500
  useHealthCheck: true
storage:
  kind: io.l5d.consul
  host: consul.service.x.x.internal.prod
  port: 8500
  pathPrefix: /namerd/dtabs
  datacenter: ent
interfaces:
-
  kind: io.l5d.thriftNameInterpreter
  ip: 0.0.0.0
  port: 5100
  cache:
    bindingCacheActive: 2000
    bindingCacheInactive: 200
    addrCacheActive: 2000
    addrCacheInactive: 200
-
  kind: io.l5d.httpController
  ip: 0.0.0.0
  port: 5180
telemetry:
=============================================
Did something change in the environment recently? How long have you been running Linkerd?
Did you check the list of endpoints known to namerd using the admin interface?
Hi,
We have been running Linkerd and Namerd since 2016, and nothing has changed in the environment. Below is the Namerd configuration file:
admin:
  port: 9991
  ip: 0.0.0.0
namers:
- kind: io.l5d.consul
  host: consul.service.x.x.internal.x
  port: 8500
  useHealthCheck: true
storage:
  kind: io.l5d.consul
  host: consul.service.x.x.internal.x
  port: 8500
  pathPrefix: /namerd/dtabs
  datacenter: ent
interfaces:
-
  kind: io.l5d.thriftNameInterpreter
  ip: 0.0.0.0
  port: 5100
  cache:
    bindingCacheActive: 2000
    bindingCacheInactive: 200
    addrCacheActive: 2000
    addrCacheInactive: 200
-
  kind: io.l5d.httpController
  ip: 0.0.0.0
  port: 5180
telemetry:
@Rajredison I can only guess that namerd is getting some stale endpoints. You can use the namerd and consul APIs to make sure that the endpoints match between the two, if it happens again. Have a look at this issue to see if it describes similar behavior to what you’re seeing.
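For the endpoint comparison suggested above, here is a rough Python sketch rather than a tested tool. It assumes the Namerd HTTP controller (port 5180 in the posted config) serves `/api/1/addr/<namespace>?path=...` and returns JSON shaped like `{"addrs": [{"ip": ..., "port": ...}]}`, and that the Consul namer uses its default `/#/io.l5d.consul` prefix. Please verify both against your Namerd version. The function names, and the namespace, datacenter, and service arguments, are placeholders made up for illustration.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def consul_endpoints(health_entries):
    """Extract (address, port) pairs from a Consul /v1/health/service response."""
    endpoints = set()
    for entry in health_entries:
        service = entry["Service"]
        # Consul leaves Service.Address empty when the service uses the node address.
        address = service.get("Address") or entry["Node"]["Address"]
        endpoints.add((address, service["Port"]))
    return endpoints


def namerd_endpoints(addr_response):
    """Extract (ip, port) pairs from a Namerd addr response.

    ASSUMPTION: the response looks like {"type": "bound", "addrs": [{"ip", "port"}]}.
    """
    return {(a["ip"], a["port"]) for a in addr_response.get("addrs", [])}


def compare(consul_set, namerd_set):
    """Report endpoints that only one of the two systems knows about."""
    return {
        "stale_in_namerd": sorted(namerd_set - consul_set),
        "missing_in_namerd": sorted(consul_set - namerd_set),
    }


def fetch_json(url, timeout=5):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def check(consul_base, namerd_base, namespace, datacenter, service):
    """Fetch both views of one service and diff them (hypothetical helper)."""
    consul_url = (
        f"{consul_base}/v1/health/service/{service}?dc={datacenter}&passing=true"
    )
    # ASSUMPTION: default namer prefix for kind io.l5d.consul.
    path = quote(f"/#/io.l5d.consul/{datacenter}/{service}", safe="")
    namerd_url = f"{namerd_base}/api/1/addr/{namespace}?path={path}"
    return compare(
        consul_endpoints(fetch_json(consul_url)),
        namerd_endpoints(fetch_json(namerd_url)),
    )
```

Calling `check("http://<consul-host>:8500", "http://<namerd-host>:5180", "<namespace>", "<dc>", "<service>")` while the error is occurring would show whether Namerd is holding addresses that Consul no longer reports as passing. If Consul registers services by hostname rather than IP, the two sets will not line up directly and the addresses would need resolving first.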