Configure traffic split with Nginx ingress controller

I’m evaluating Linkerd as a service mesh for our company project, but I wasn’t able to set up traffic split with the nginx ingress. Here is my setup: https://gist.github.com/kopachevsky/fe72344c0b04d606a3175f8197d3319e

First question: do I need to inject the Linkerd proxy into the ingress controller pods themselves? I’ve tried both options, but traffic always goes to the frontend-v1 service when I call through the public ingress.

If I make an internal call from a test pod, the traffic split works well.

Hi @kopachevsky, this is a good question.

The Ingress controller does need to be injected with the Linkerd proxy because the traffic split happens on the client side.
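For reference, one way to inject the controller is to add the Linkerd annotation to its pod template. This is only a sketch; the deployment name and namespace depend on how your nginx ingress was installed:

```yaml
# Sketch only: add this annotation to your ingress controller
# deployment's pod template, then let the rollout recreate the pods.
spec:
  template:
    metadata:
      annotations:
        linkerd.io/inject: enabled
```

Alternatively, piping the deployment manifest through linkerd inject and applying the result achieves the same thing.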

I looked at the gist, and it looks like the definitions of service-v1.yaml and service-v2.yaml are the same. Can you check to make sure that you uploaded the right files?

I’m working on reproducing this with the files that you provided.

@kopachevsky, I updated the service-v2.yaml file with the contents below and used kubectl exec to get a shell in an nginx-ingress-controller pod in my environment. From that container, I ran while true; do curl http://frontend-v1.ex:8080; sleep 1; done and saw the responses split between V1 and V2.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
  labels:
    app: frontend-v2
  name: frontend-v2
  namespace: ex
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: frontend-v2
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        linkerd.io/inject: enabled
      labels:
        app: frontend-v2
    spec:
      containers:
      - image: nginx:alpine
        imagePullPolicy: IfNotPresent
        name: nginx
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/nginx/nginx.conf
          name: cfg
          subPath: nginx.conf
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: frontend-v2
        name: cfg

Give this a shot and let me know if you still see unexpected behavior.
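For completeness, the matching TrafficSplit for an even split would look something like this. This is a sketch based on the service names in the gist; the resource name frontend-split is hypothetical, and the apex service and weights should match your setup:

```yaml
apiVersion: split.smi-spec.io/v1alpha1
kind: TrafficSplit
metadata:
  name: frontend-split   # hypothetical name
  namespace: ex
spec:
  service: frontend-v1   # apex service that clients address
  backends:
  - service: frontend-v1
    weight: 500m
  - service: frontend-v2
    weight: 500m
```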

@cpretzer sorry for this mistake. I copy-pasted the same code for v1 and v2; the gist is updated now. I will now do the same test you did and run curl from the nginx controller.

@cpretzer I’ve repeated your test. From the nginx controller pod I get an even split:

bash-5.0$  while true; do curl http://frontend-v1.ex:8080; sleep 1; done
V1
V2
V2
V2
V1
V1
V2
V1
V1
V1
V2
V1

But if I do the same from the public endpoint attached to the ingress gateway:

while true; do curl http://$PUBLIC_IP; done
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1

Here is the ingress config:

kubectl describe ing hello-world-ingress -n ex
Name:             hello-world-ingress
Namespace:        ex
Address:          10.0.0.5
Default backend:  default-http-backend:80 (<none>)
Rules:
  Host  Path  Backends
  ----  ----  --------
  *
           frontend-v1:8080 (10.0.0.35:8080)

Is this expected behaviour?

@kopachevsky, that is really unexpected.

I ran a similar test and had different results, although I specified a host header for the ingress. Here is the ingress definition and the command that I used:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/configuration-snippet: |
      proxy_set_header l5d-dst-override $service_name.$namespace.svc.cluster.local:$service_port;
      grpc_set_header l5d-dst-override $service_name.$namespace.svc.cluster.local:$service_port;
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
  name: hello-world-ingress
  namespace: ex
spec:
  rules:
  - host: ts.linkerd.test
    http:
      paths:
      - backend:
          serviceName: frontend-v1
          servicePort: 8080
status:
  loadBalancer:
    ingress:
    - ip: 10.0.0.98
while true; do curl -v -H "HOST: ts.linkerd.test" http://$PUBLIC_IP; sleep 1; done

Can you try with the host header?

Thanks, I will try to rebuild the cluster from scratch as well. What about the status.loadBalancer config, should I use it too?

@kopachevsky, no, you can disregard that. Kubernetes won’t set the status from a YAML file; it assigns its own status to the resource that it creates.

@kopachevsky did you have any luck when you added the HOST header?

It works with the host header! It did not work before I injected the Linkerd proxy into the nginx controller pod, but it clearly works now, thanks!

That’s good to hear. Please let us know if you have any additional questions. 🙂

I have a similar problem to the one documented in this thread.

I documented my issue here, which looks similar to your configuration:

I am trying to figure out your fix (“not work before I injected linkerd proxy into nginx controller pod”). Did you inject the proxy into the nginx controller in kube-system to get this to work?

❯ k -n kube-system get pods | grep nginx-controller
NAME                                        READY   STATUS    RESTARTS   AGE
ingress-nginx-controller-7bb4c67d67-g5v64   1/1     Running   0          2d21h

Here is the gist with the configuration of Ingress:

Hi @seizadi, can you describe the behavior that you’re seeing in a bit more detail?

Are you sending curl requests with the host header?

Yes, there is a curl call running every second that displays the host responding behind the ingress, and I don’t see any responses from the canary, even though the control plane shows the mix. I have a local traffic generator connected, and I can see from the Linkerd dashboard that its traffic is being split, so it is only a problem with the ingress.

watch -n 1 curl minikube
Every 1.0s: curl minikube                    sc-l-seizadi-2.local: Fri Jul 17 10:40:48 2020

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   369  100   369    0     0  92250      0 --:--:-- --:--:-- --:--:-- 92250
{
  "hostname": "podinfo-primary-5d8f7b98f8-bdfgj",
  "version": "3.1.1",
  "revision": "7b6f11780ab1ce8c7399da32ec6966215b8e43aa",
  "color": "#34577c",
  "logo": "https://eks.handson.flagger.dev/cuddle_bunny.gif",
  "message": "greetings from podinfo v3.1.1",
  "goos": "linux",
  "goarch": "amd64",
  "runtime": "go1.13.1",
  "num_goroutine": "8",
  "num_cpu": "3"
}

@seizadi I tested your gist using kind instead of minikube and I’m seeing the different versions from the curl command. Can you reproduce this outside of minikube?

Did you install the ingress following these docs?

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   360  100   360    0     0  14121      0 --:--:-- --:--:-- --:--:-- 14400
{
  "hostname": "podinfo-958bf9ff5-tc996",
  "version": "3.1.2",
  "revision": "7b6f11780ab1ce8c7399da32ec6966215b8e43aa",
  "color": "#34577c",
  "logo": "https://eks.handson.flagger.dev/cuddle_bunny.gif",
  "message": "greetings from podinfo v3.1.2",
  "goos": "linux",
  "goarch": "amd64",
  "runtime": "go1.13.1",
  "num_goroutine": "9",
  "num_cpu": "6"
}

NAME      APEX      LEAF              WEIGHT   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
podinfo   podinfo   podinfo-canary        45   100.00%   0.3rps           2ms           2ms           2ms
podinfo   podinfo   podinfo-primary       55   100.00%   0.4rps           2ms           2ms           2ms

I picked minikube because running an ingress on it is trivial and just requires enabling an addon. I will try to create an nginx ingress on a kind cluster and see if I can get this to work.

Sounds good, please let me know if you see the same behavior on kind.

One thought that I had is that the traffic may have all been shifted from the canary to the primary. By default, Flagger automatically shifts traffic from the canary to the primary, and it’s possible that all of the traffic had been shifted over by the time you ran the curl command.

Can you share the list of steps that you followed?

To debug this better, I set up another manifest that put the system in a 50/50 primary/canary split and used much simpler deployments. I found that the traffic split worked on the root service, but I could not get the nginx ingress to work on Minikube or Kind.

I wrote a description of the test here, starting with how to set up Kind and nginx. Since you have it working on your setup, maybe you are setting it up differently:

Here is the description of my test that shows it working within the cluster but not behind the ingress:

I got this to work with Ingress:

I injected the nginx controller so that it is part of the mesh:

kubectl -n kube-system get deploy ingress-nginx-controller -o yaml | \
   linkerd inject - | \
   kubectl apply -f -

Now the ingress is working with my traffic split:

❯ while true; do curl minikube/echo; sleep 1; done
primary
primary
canary
primary
canary

So if you want the ingress to work, you have to include the ingress controller in the mesh. Is this how it is supposed to work by design?

@seizadi glad to hear that you got the canary deployments working.

To answer your question, external traffic will not be shifted unless the ingress controller is part of the mesh because traffic is shifted on the client side. There is a note in the canary release docs about this:

Note

Traffic shifting occurs on the client side of the connection and not the server side. Any requests coming from outside the mesh will not be shifted and will always be directed to the primary backend. A service of type LoadBalancer will exhibit this behavior as the source is not part of the mesh. To shift external traffic, add your ingress controller to the mesh.

I hope this helps!