Namerd: etcd 400 bad request on GET /v2/keys/namerd/dtabs?recursive=true

Hi Guys

I’m trying to set up namerd to use etcd as a storage backend on Kubernetes.

I have a healthy etcd cluster running in Kubernetes; I validated this by exec-ing into the cluster and making sure I can use etcdctl.

❯ kubectl --namespace namerd exec -it etcd-0000 sh
/ # etcdctl cluster-health
member 360cc9bb22f2cb0f is healthy: got healthy result from http://etcd-0001.etcd.namerd.svc:2379
member 398a981510b1fa9c is healthy: got healthy result from http://etcd-0000.etcd.namerd.svc:2379
member a0a48f345c6d0b1a is healthy: got healthy result from http://etcd-0002.etcd.namerd.svc:2379
cluster is healthy

I then opened a port forward to my namerd cluster and created some basic namespaces:

kubectl port-forward $(kubectl get pods -n namerd -l app=namerd -o jsonpath='{ .items[0].metadata.name }') -n namerd 4180:4180
❯ echo "
    /srv=>/#/io.l5d.k8s/dev/grpc;
    /svc=>/srv
" | namerctl dtab create dev-in -
echo "
    /srv=>/#/io.l5d.k8s/dev/grpc;
    /svc=>/srv
" | namerctl dtab create dev-out -
echo "
    /srv=>/#/io.l5d.k8s/qa/grpc;
    /svc=>/srv
" | namerctl dtab create qa-in -
echo "
    /srv=>/#/io.l5d.k8s/qa/grpc;
    /svc=>/srv
" | namerctl dtab create qa-out -
echo "
    /srv=>/#/io.l5d.k8s/prod/grpc;
    /svc=>/srv
" | namerctl dtab create prod-in -
echo "
    /srv=>/#/io.l5d.k8s/prod/grpc;
    /svc=>/srv
" | namerctl dtab create prod-out -
Created dev-in
Created dev-out
Created qa-in
Created qa-out
Created prod-in
Created prod-out

I then validated that these namespaces were indeed created in etcd:

❯ kubectl --namespace namerd exec -it etcd-0000 sh     
/ # etcdctl ls / -r
/namerd
/namerd/dtabs
/namerd/dtabs/prod-in
/namerd/dtabs/prod-out
/namerd/dtabs/dev-in
/namerd/dtabs/dev-out
/namerd/dtabs/qa-in
/namerd/dtabs/qa-out

As you can see, these were all created, so namerd can talk to etcd.

However, namerctl dtab list returns a 500:

❯ namerctl dtab list
Error: unexpected response: 500 Internal Server Error

Looking at the logs from namerd, I came across this stack trace:

io.buoyant.etcd.UnexpectedResponse: GET /v2/keys/namerd/dtabs?recursive=true [(recursive -> true)] Status(400)
	at io.buoyant.etcd.NodeOp$.$anonfun$mk$4(NodeOp.scala:71)
	at com.twitter.util.Try.transform(Try.scala:222)
	at io.buoyant.etcd.NodeOp$.mk(NodeOp.scala:69)
	at io.buoyant.etcd.Key.$anonfun$get$2(Key.scala:83)
	at com.twitter.util.Future.$anonfun$flatMap$1(Future.scala:1740)
	at com.twitter.util.Promise$Transformer.liftedTree1$1(Promise.scala:228)
	at com.twitter.util.Promise$Transformer.k(Promise.scala:228)
	at com.twitter.util.Promise$Transformer.apply(Promise.scala:239)
	at com.twitter.util.Promise$Transformer.apply(Promise.scala:220)
	at com.twitter.util.Promise$$anon$7.run(Promise.scala:532)
	at com.twitter.concurrent.LocalScheduler$Activation.run(Scheduler.scala:198)
	at com.twitter.concurrent.LocalScheduler$Activation.submit(Scheduler.scala:157)
	at com.twitter.concurrent.LocalScheduler.submit(Scheduler.scala:274)
	at com.twitter.concurrent.Scheduler$.submit(Scheduler.scala:109)
	at com.twitter.util.Promise.runq(Promise.scala:522)
	at com.twitter.util.Promise.updateIfEmpty(Promise.scala:887)
	at com.twitter.util.Promise.update(Promise.scala:859)
	at com.twitter.util.Promise.setValue(Promise.scala:835)
	at com.twitter.concurrent.AsyncQueue.offer(AsyncQueue.scala:122)
	at com.twitter.finagle.netty4.transport.ChannelTransport$$anon$1.channelRead(ChannelTransport.scala:183)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at com.twitter.finagle.netty4.http.handler.UnpoolHttpHandler$.channelRead(UnpoolHttpHandler.scala:32)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:438)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
	at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:371)
	at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342)
	at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:281)
	at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
	at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1337)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:916)
	at io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:744)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at com.twitter.finagle.util.BlockingTimeTrackingThreadFactory$$anon$1.run(BlockingTimeTrackingThreadFactory.scala:23)
	at java.lang.Thread.run(Thread.java:748)

I can see the error has to do with the GET request to /v2/keys/namerd/dtabs?recursive=true, so I entered the namerd cluster and attempted to make the request myself:

❯ kubectl --namespace namerd exec -it namerd-7rnx1 bash
root@namerd-7rnx1:/io.buoyant/namerd/1.3.1# curl http://etcd.namerd:2379/v2/keys/namerd/dtabs?recursive=true
{"action":"get","node":{"key":"/namerd/dtabs","dir":true,"nodes":[{"key":"/namerd/dtabs/dev-out","value":"/srv=\u003e/#/io.l5d.k8s/dev/grpc;/svc=\u003e/srv","modifiedIndex":10,"createdIndex":10},{"key":"/namerd/dtabs/qa-in","value":"/srv=\u003e/#/io.l5d.k8s/qa/grpc;/svc=\u003e/srv","modifiedIndex":11,"createdIndex":11},{"key":"/namerd/dtabs/qa-out","value":"/srv=\u003e/#/io.l5d.k8s/qa/grpc;/svc=\u003e/srv","modifiedIndex":12,"createdIndex":12},{"key":"/namerd/dtabs/prod-in","value":"/srv=\u003e/#/io.l5d.k8s/prod/grpc;/svc=\u003e/srv","modifiedIndex":13,"createdIndex":13},{"key":"/namerd/dtabs/prod-out","value":"/srv=\u003e/#/io.l5d.k8s/prod/grpc;/svc=\u003e/srv","modifiedIndex":14,"createdIndex":14},{"key":"/namerd/dtabs/dev-in","value":"/srv=\u003e/#/io.l5d.k8s/dev/grpc;/svc=\u003e/srv","modifiedIndex":9,"createdIndex":9}],"modifiedIndex":8,"createdIndex":8}}

As you can see, this returned a 200 OK with valid JSON.

I am confused as to why etcd would throw a 400 error for the GET request above, since it works when I make it by hand. I can only conclude that namerd sets some headers or does something else to the request that I can’t see in the stack trace.

I tried looking at the source code, but I’m not familiar with Scala. I got as far as https://github.com/linkerd/linkerd/blob/master/etcd/src/main/scala/io/buoyant/etcd/Key.scala#L70-L84, but this doesn’t appear to add any special headers, and neither does https://github.com/linkerd/linkerd/blob/master/etcd/src/main/scala/io/buoyant/etcd/Etcd.scala#L31-L50.

So I am at a loss.

The versions of etcd deployed:

curl http://etcd.namerd:2379/version
{"etcdserver":"3.2.9","etcdcluster":"3.2.0"}

Have I missed something in the config? Should I be running an earlier version of etcd?

Here is my namerd config:

# Admin Port
admin:
  ip: 0.0.0.0
  port: 9991
# namers are used to name service discovery backends to resolve to
# bound addresses
namers:
- kind: io.l5d.k8s    # Use kubernetes API for service discovery
  experimental: true  # This is an experimental feature, but it does work ;)
  host: localhost     # What host is the API on
  port: 8001          # What port is the API on
# storage is where data for namerd will be stored, such as the dtabs and cache
storage:
  kind: io.l5d.etcd
  experimental: true
  host: etcd.namerd
  port: 2379
  pathPrefix: /namerd/dtabs
# interfaces define how namerd is exposed to other services, such as linkerd
interfaces:
  # Thrift is used for linkerd communication - Read Only
- kind: io.l5d.thriftNameInterpreter
  ip: 0.0.0.0 # IP to bind to
  port: 4100  # Port to bind to
  # HTTP controller for storage - read / write
- kind: io.l5d.httpController
  ip: 0.0.0.0 # IP to bind to
  port: 4180  # Port to bind to

Thanks,

Chris

Hi Chris,

That is very odd. Linkerd doesn’t set any special headers when talking to the etcd API, just query parameters. So, unfortunately, I don’t have an explanation as to why etcd seems to be responding differently to what appears to be the same request.

Please do update this topic if you figure it out, though!

Thanks @Alex for your reply.

It would be good if your logging were more verbose in stack traces, so we could see request headers and the like; that would let us replicate the exact same request and reproduce the error to aid debugging.

Do you have a recommended version of etcd that you use internally and know works with namerd?

Thanks,

Chris

@Alex I installed tcpflow on one of my namerd containers and ran a namerctl request against it for dtabs. This was the request and response captured against etcd:

010.056.000.031.50276-010.056.001.030.02379: GET /v2/keys/namerd/dtabs?recursive=true HTTP/1.1
Accept: application/json
X-B3-TraceId: f6277069f24295be
X-B3-SpanId: 7df74d2a46384724
X-B3-ParentSpanId: f6277069f24295be
X-B3-Sampled: false
X-B3-Flags: 0
Finagle-Ctx-com.twitter.finagle.Retries: 0


010.056.001.030.02379-010.056.000.031.50276: HTTP/1.1 400 Bad Request: missing required Host header
Content-Type: text/plain; charset=utf-8
Connection: close

400 Bad Request: missing required Host header

It looks like namerd is sending the request without a Host header.

Here is a request/response pair from etcd when I try to update a dtab:

010.056.000.031.53846-010.056.001.030.02379: PUT /v2/keys/namerd/dtabs/dev-in HTTP/1.1
Host: host
Content-Length: 70
Content-Type: application/x-www-form-urlencoded
X-B3-TraceId: 4f0726ce5cc6277f
X-B3-SpanId: 19e09a2a59e5f26e
X-B3-ParentSpanId: 4f0726ce5cc6277f
X-B3-Sampled: false
X-B3-Flags: 0
Finagle-Ctx-com.twitter.finagle.Retries: 0

value=%2Fsrv%3D%3E%2F%23%2Fio.l5d.k8s%2Fqa%2Fgrpc%3B%2Fsvc%3D%3E%2Fsrv
010.056.001.030.02379-010.056.000.031.53846: HTTP/1.1 200 OK
Content-Type: application/json
X-Etcd-Cluster-Id: acde95247f205b11
X-Etcd-Index: 15
X-Raft-Index: 24
X-Raft-Term: 8
Date: Tue, 14 Nov 2017 11:35:18 GMT
Content-Length: 288

{"action":"set","node":{"key":"/namerd/dtabs/dev-in","value":"/srv=\u003e/#/io.l5d.k8s/qa/grpc;/svc=\u003e/srv","modifiedIndex":15,"createdIndex":15},"prevNode":{"key":"/namerd/dtabs/dev-in","value":"/srv=\u003e/#/io.l5d.k8s/dev/grpc;/svc=\u003e/srv","modifiedIndex":9,"createdIndex":9}}

Note that for the PUT, namerd does set a Host header, albeit with the literal value host.

I am guessing that https://github.com/linkerd/linkerd/blob/master/etcd/src/main/scala/io/buoyant/etcd/Etcd.scala#L46 should also pass the URL through requestBuilder.url, as the POST and PUT paths do.

According to the Finagle documentation, url will set the Host header for a request: https://twitter.github.io/finagle/docs/com/twitter/finagle/http/RequestBuilder.html#url(u:java.net.URL):com.twitter.finagle.http.RequestBuilder[com.twitter.finagle.http.RequestConfig.Yes,HasForm].

Since etcd is written in Go, I went and dug around there and found this:

https://go-review.googlesource.com/c/go/+/17892, which was merged in December 2015 and likely shipped in a Go release in 2016.

etcd likely updated its Go version and thus picked up the requirement that HTTP/1.1 requests carry a Host header, probably without anyone noticing. Since the Host header is mandatory in HTTP/1.1 (RFC 7230 even requires servers to respond with a 400 when it is missing), I still feel this is a bug in namerd: requests should always be sent with a Host header.

I’m going to see if I can run a proxy as a sidecar to etcd that forces a Host header (it can’t be written in Go, though, lol).

Ah, great find! Thanks for looking into this. Would you mind filing an issue on the Linkerd project on GitHub for setting the Host header on requests to etcd? We’ll try to get this fixed as quickly as possible.

@Alex yup, done here: http://bit.ly/2zZsK6B :slight_smile:


This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.