Help needed - Linkerd com.twitter.finagle.stats.Stat values not displaying inside /admin/metrics.json

Hello,

I am a senior in the Computer Science department at the University of San Francisco. I am working on an academic project that uses Linkerd for our service mesh solution, called Hydration.

Inside my service I have a dozen metrics that report service-specific numbers. I have developed a Linkerd plugin whose custom namer accepts a com.twitter.finagle.stats.StatsReceiver object. This object is passed to my service object, where I instantiate a com.twitter.finagle.stats.Counter via the following call:

com.twitter.finagle.stats.Counter counter = this.statsReceiver.counter(getSeq("numberOfRetries"));

Then inside the relevant methods I increment this counter like this:

com.twitter.finagle.stats.Counter counter = getServiceMetricsCounter(svc, "numberOfRetries");
if (counter != null) {
    counter.incr();
}

When I run Linkerd, I can see my counter metrics inside /admin/metrics.json, under the config id I provided for the custom Linkerd plugin namer:
"namer/#/com.usf.twitter.hydration/numberOfRetries" : 12,

I get similar results for the other metrics for which I use counters. However, I am having difficulty in two cases:

  1. I want to get percentiles/histograms for my time-series metrics, specifically these (telemetry-core-src-main-scala-io-buoyant-telemetry-Metric.scala#L54-L64). For example, I can see metrics like the following inside /admin/metrics.json when I run Linkerd:
    "rt/outgoing_http2/client/$/inet/kubeservice.com/16024/request_latency_ms.count" : 2648,
    "rt/outgoing_http2/client/$/inet/kubeservice.com/16024/request_latency_ms.max" : 529,
    "rt/outgoing_http2/client/$/inet/kubeservice.com/16024/request_latency_ms.min" : 0,
    "rt/outgoing_http2/client/$/inet/kubeservice.com/16024/request_latency_ms.p50" : 246,
    "rt/outgoing_http2/client/$/inet/kubeservice.com/16024/request_latency_ms.p90" : 575,
    "rt/outgoing_http2/client/$/inet/kubeservice.com/16024/request_latency_ms.p95" : 529,
    "rt/outgoing_http2/client/$/inet/kubeservice.com/16024/request_latency_ms.p99" : 615,
    "rt/outgoing_http2/client/$/inet/kubeservice.com/16024/request_latency_ms.p9990" : 429,
    "rt/outgoing_http2/client/$/inet/kubeservice.com/16024/request_latency_ms.p9999" : 529,
    "rt/outgoing_http2/client/$/inet/kubeservice.com/16024/request_latency_ms.sum" : 1173897,
    "rt/outgoing_http2/client/$/inet/kubeservice.com/16024/request_latency_ms.avg" : 254.07148219833456

I see that you have defined a stat object for the ‘request_latency_ms’ metric here (https://github.com/linkerd/linkerd/blob/master/router/h2/src/main/scala/io/buoyant/router/h2/StreamStatsFilter.scala#L132). You then add a value to this object here (https://github.com/linkerd/linkerd/blob/master/router/h2/src/main/scala/io/buoyant/router/h2/StreamStatsFilter.scala#L179).

I do something very similar:

com.twitter.finagle.stats.Stat stat = this.statsReceiver.stat(getSeq("retryLatencyInMillisecs"));

followed by

com.twitter.finagle.stats.Stat stat = getServiceMetricsStat(svc, "retryLatencyInMillisecs");
if (stat != null) {
    stat.add(val); // where val = 25.0f or something
}

However, when I run Linkerd I don’t see these com.twitter.finagle.stats.Stat metrics inside /admin/metrics.json, although I do see the com.twitter.finagle.stats.Counter metrics. Why? How can I get those Stat metrics, with their percentiles, to appear inside /admin/metrics.json?

I realize that those histogram-based percentiles come from the Stat defined inside io.buoyant.telemetry.Metric (telemetry-core-src-main-scala-io-buoyant-telemetry-Metric.scala#L22-L70). I don’t want to add a Linkerd dependency inside my service, to avoid any circular dependency.

I was thinking along the following lines:

class HydrationNamer(
  server: String,
  srStats: StatsReceiver = NullStatsReceiver,
  metrics: MetricsTree
) extends Namer {

  private[this] val log = Logger.get("com.usf.twitter.hydration")
  val cfg: Hydration.conf = new Hydration.conf(srStats, server)
  Hydration.init(cfg)
  val hydration: Hydration = new Hydration()

  val m = metrics.mkStat(Verbosity.Default)
  // Now how do I link srStats.stat("My Metric") with metrics.mkStat(Verbosity.Default)
  // in order to see my stat metrics and percentiles inside /admin/metrics.json?
}

  2. This is a simple follow-up to the previous question. Some of my metrics are floating point (float or double). How do I display floating-point metrics inside /admin/metrics.json, just as I do for counters? I don’t want to simply add 1 as I do for counters; I want to display the value at a particular point in time. Can I only achieve that using your Linkerd Metric/Stat, or do I need to use a Gauge from StatsReceiver? If so, do you have an example for that case?

Waiting for your earliest reply.

Yours sincerely,
Derek
University of San Francisco

Hi,

Regarding your first question, I’m not quite sure why you don’t see the stats. One thing I do want to note is that we don’t expose raw stats; we expose histograms. New stats therefore don’t show up immediately: you need to wait about 1 minute for them to be aggregated. Have you waited a bit before checking whether they are displayed?
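
To illustrate the aggregation delay described above, here is a minimal Java sketch of a stat whose samples only become visible after a periodic snapshot. WindowedStat and all of its methods are hypothetical names made up for this illustration. This is not Finagle's or Linkerd's code: the percentile uses a simple nearest-rank rule rather than a bucketed histogram, and reporting only count for an empty window is a choice of this toy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a stat; NOT Finagle's or Linkerd's implementation. */
class WindowedStat {
    private final List<Float> current = new ArrayList<>(); // samples since the last snapshot
    private float[] published = new float[0];              // sorted samples visible to readers

    /** Record one sample; readers do not see it until the next snapshot(). */
    synchronized void add(float value) {
        current.add(value);
    }

    /** Publish the samples gathered so far and start a new, empty window. */
    synchronized void snapshot() {
        published = new float[current.size()];
        for (int i = 0; i < published.length; i++) {
            published[i] = current.get(i);
        }
        Arrays.sort(published);
        current.clear();
    }

    /** Summary of the last published window, keyed like the metrics.json suffixes. */
    synchronized Map<String, Double> summary() {
        Map<String, Double> out = new LinkedHashMap<>();
        out.put("count", (double) published.length);
        if (published.length == 0) {
            return out; // assumption of this toy: an empty window reports only count
        }
        double sum = 0;
        for (float v : published) {
            sum += v;
        }
        out.put("min", (double) published[0]);
        out.put("max", (double) published[published.length - 1]);
        out.put("sum", sum);
        out.put("avg", sum / published.length);
        out.put("p50", percentile(0.50));
        out.put("p99", percentile(0.99));
        return out;
    }

    // Nearest-rank percentile over the sorted, published window.
    private double percentile(double q) {
        int rank = (int) Math.ceil(q * published.length);
        return published[Math.max(rank, 1) - 1];
    }
}
```

In this toy, calling add(25.0f) and then summary() still reports a count of 0; the sample only shows up once snapshot() has run, which is the role the roughly one-minute aggregation plays above.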

In regards to your second question, using a Gauge should help you solve it.
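
As a rough illustration of the gauge idea, here is a self-contained Java sketch in which a gauge is a callback that is evaluated whenever the metrics are read, so it reports the current floating-point value instead of an accumulated count. GaugeRegistry, GaugeDemo, and loadFactor are hypothetical names invented for this example and only imitate the concept. As far as I know, Finagle's StatsReceiver offers addGauge for this purpose and holds the returned Gauge only weakly, so the caller should keep a reference to it; please check the Finagle documentation for the exact Java signature.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical toy registry; it imitates the gauge concept, not Finagle's API. */
class GaugeRegistry {
    private final Map<String, Supplier<Float>> gauges = new ConcurrentHashMap<>();

    /** Register a callback that is evaluated each time the metrics are read. */
    void addGauge(String name, Supplier<Float> read) {
        gauges.put(name, read);
    }

    /** Sample every gauge now, as a metrics endpoint would on each request. */
    Map<String, Float> sample() {
        Map<String, Float> out = new TreeMap<>();
        gauges.forEach((name, read) -> out.put(name, read.get()));
        return out;
    }
}

class GaugeDemo {
    // The service updates this value; the gauge merely reports the latest one.
    static final AtomicReference<Float> loadFactor = new AtomicReference<>(0.0f);

    /** Build a registry with one gauge that reads the current loadFactor. */
    static GaugeRegistry build() {
        GaugeRegistry registry = new GaugeRegistry();
        registry.addGauge("loadFactor", loadFactor::get);
        return registry;
    }
}
```

The service code only ever calls loadFactor.set(...); each call to sample() then returns whatever value is current at that moment, which is the "value at a particular time" behavior asked about.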

Good luck!

Hi Franzi,

So, it seems the issue was that I was using the same name string for both the counter and the stat of a given metric. Once I used different strings to instantiate the counter and the stat object, it worked. However, /admin/metrics.json now shows the following results:

"namer/#/com.usf.twitter.hydration/numberOfRetriesCounter" : 12,
"namer/#/com.usf.twitter.hydration/numberOfRetriesStat.count" : 0,
"namer/#/com.usf.twitter.hydration/numberOfConnectionsCounter" : 8,
"namer/#/com.usf.twitter.hydration/numberOfConnectionsStat.count" : 0,
"namer/#/com.usf.twitter.hydration/numberOfRetriesCounter" : 22,
"namer/#/com.usf.twitter.hydration/numberOfRetriesStat.count" : 0,

  1. For the counter I see the result I expect from counter.incr(). For the stat, however, when I call stat.add(1.0f) in the same method that calls counter.incr(), stat.count is always 0 (I tested this over periods from 1 hour to 24 hours). Why? What exactly does stat.count represent at any point in time?

  2. Also, I don’t see p50, p95, p99, p999, p9999, avg, sum, or max, just stat.count. How do I get the histogram percentiles shown here (https://github.com/linkerd/linkerd/blob/7dd505d9443b1fcd972235f0b69eb8b6bb9737e1/admin/src/main/resources/io/buoyant/admin/js/spec/fixtures/metrics.js#L1007-L1016), which seem to come from the Linkerd Stat defined here (https://github.com/linkerd/linkerd/blob/master/telemetry/core/src/main/scala/io/buoyant/telemetry/Metric.scala#L76-L87)? I want to see stat.count, stat.p50, stat.p99, etc. inside /admin/metrics.json.

I instantiate the counter and the stat in the same way, from the statsReceiver passed to my service object by my Linkerd namer plugin:

com.twitter.finagle.stats.Counter counter = this.statsReceiver.counter(getSeq("numberOfRetriesCounter"));
com.twitter.finagle.stats.Stat stat = this.statsReceiver.stat(getSeq("numberOfRetriesStat"));

public void registerRetry(final String svc) {
    com.twitter.finagle.stats.Counter counter = getServiceMetricsCounter(svc, "numberOfRetriesCounter");
    if (counter != null) {
        counter.incr();
    }

    com.twitter.finagle.stats.Stat stat = getServiceMetricsStat(svc, "numberOfRetriesStat");
    if (stat != null) {
        stat.add(1.0f);
    }
}
