We recently moved one of our downstream services (PHP) to the latest Thrift, 0.11.0. Python Thrift clients (running version 0.9.0 of the library) connect to it via linkerd. Prior to the upgrade, we were using Thrift 0.9.0.
After the upgrade, things mostly worked. We didn’t have catastrophic failures. However, over a period of 24 hours we began to see a consistent spike of ChannelClosedException for connections to this server.
I am just wondering if anybody here is running Thrift servers on 0.11.0 and has had any issues?
Hi @amitsaha, we haven’t seen anything like this before. Could you try having the Python Thrift 0.9.0 client connect directly to the PHP Thrift 0.11.0 service? I’m wondering if this issue is due to the Thrift upgrade rather than Linkerd, so let’s rule that out first.
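A quick way to start ruling things out, before wiring up the real Thrift client, is a raw TCP probe pointed straight at the service, bypassing Linkerd. The sketch below is only an illustration: `probe_idle_connection` is a hypothetical helper, not part of Thrift or Linkerd, and it only checks whether the peer closes an idle connection within a given window. It does not speak the Thrift protocol.

```python
import socket


def probe_idle_connection(host, port, idle_seconds=5.0, connect_timeout=3.0):
    """Open a TCP connection, stay idle, and report what the peer does.

    Returns "closed_by_peer" if the peer closes the connection while we wait,
    "still_open" if nothing happens for idle_seconds, "reset" on a TCP reset,
    or "unexpected_data" if the peer sends bytes unprompted.
    """
    with socket.create_connection((host, port), timeout=connect_timeout) as sock:
        # From here on the timeout bounds how long recv() may block.
        sock.settimeout(idle_seconds)
        try:
            data = sock.recv(1)
        except socket.timeout:
            return "still_open"
        except ConnectionResetError:
            return "reset"
        # An empty read means the peer performed an orderly shutdown.
        return "closed_by_peer" if data == b"" else "unexpected_data"
```

Running it once against the service address and once against the Linkerd address would show whether idle connections are dropped by the server itself or somewhere in between.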
Okay, it turns out there was an issue with our PHP Thrift server. We didn’t set the send and receive timeout values explicitly when we switched to 0.11. Our earlier internal forked version had been patched manually (yay!) to have the timeouts, and we missed that.
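For readers unfamiliar with what an explicit socket timeout changes, here is a minimal Python sketch using only the standard library. It is not the PHP fix itself: `recv_with_timeout` is a hypothetical helper, and the timeout value is arbitrary. In the PHP Thrift library the corresponding settings are, as far as I know, `TSocket::setSendTimeout()` and `TSocket::setRecvTimeout()`; check the version you run before relying on that.

```python
import socket


def recv_with_timeout(sock, nbytes, timeout_seconds):
    """Read up to nbytes, giving up after timeout_seconds.

    Returns the bytes read, or None if the timeout expired first.
    In Python, settimeout() bounds both send and receive calls on the socket.
    """
    sock.settimeout(timeout_seconds)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None
```

The point is that the timeout is stated in code rather than inherited from whatever default the library ships with, so a library upgrade cannot silently change it.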