-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suspected off-by-one error when bucketing histogram values #488
Comments
Thank you for the thorough analysis. We take I think it would make sense to fix the issue there and then backport it, unless you think it's somehow specific to ebpf_exporter. |
Well, that's the thing - I don't know I'd say I personally think the right move would be adjust the two |
See cloudflare#488 for more details. With `exp2` histogram: * Go: ``` $ go run cmd/derp/main.go poc_metric_bucket{le="0"} 1 poc_metric_bucket{le="1"} 2 poc_metric_bucket{le="2"} 3 poc_metric_bucket{le="4"} 5 poc_metric_bucket{le="8"} 9 poc_metric_bucket{le="+Inf"} 9 poc_metric_sum 36 poc_metric_count 9 ``` * `ebpf_exporter`: ``` ebpf_exporter_poc_exp2zero_values_bucket{le="0"} 1 ebpf_exporter_poc_exp2zero_values_bucket{le="1"} 2 ebpf_exporter_poc_exp2zero_values_bucket{le="2"} 3 ebpf_exporter_poc_exp2zero_values_bucket{le="4"} 5 ebpf_exporter_poc_exp2zero_values_bucket{le="8"} 9 ebpf_exporter_poc_exp2zero_values_bucket{le="+Inf"} 9 ebpf_exporter_poc_exp2zero_values_sum 36 ebpf_exporter_poc_exp2zero_values_count 9 ``` With `exp2zero` histogram: * Go: ``` $ go run cmd/derp/main.go poc_metric_bucket{le="1"} 1 poc_metric_bucket{le="2"} 2 poc_metric_bucket{le="4"} 4 poc_metric_bucket{le="8"} 8 poc_metric_bucket{le="+Inf"} 8 poc_metric_sum 36 poc_metric_count 8 ``` * `ebpf_exporter` ``` ebpf_exporter_poc_exp2_values_bucket{le="1"} 1 ebpf_exporter_poc_exp2_values_bucket{le="2"} 2 ebpf_exporter_poc_exp2_values_bucket{le="4"} 4 ebpf_exporter_poc_exp2_values_bucket{le="8"} 8 ebpf_exporter_poc_exp2_values_bucket{le="+Inf"} 8 ebpf_exporter_poc_exp2_values_sum 36 ebpf_exporter_poc_exp2_values_count 8 ```
See cloudflare#488 for more details. With `exp2zero` histogram: * Go: ``` $ go run cmd/derp/main.go poc_metric_bucket{le="0"} 1 poc_metric_bucket{le="1"} 2 poc_metric_bucket{le="2"} 3 poc_metric_bucket{le="4"} 5 poc_metric_bucket{le="8"} 9 poc_metric_bucket{le="+Inf"} 9 poc_metric_sum 36 poc_metric_count 9 ``` * `ebpf_exporter`: ``` ebpf_exporter_poc_exp2zero_values_bucket{le="0"} 1 ebpf_exporter_poc_exp2zero_values_bucket{le="1"} 2 ebpf_exporter_poc_exp2zero_values_bucket{le="2"} 3 ebpf_exporter_poc_exp2zero_values_bucket{le="4"} 5 ebpf_exporter_poc_exp2zero_values_bucket{le="8"} 9 ebpf_exporter_poc_exp2zero_values_bucket{le="+Inf"} 9 ebpf_exporter_poc_exp2zero_values_sum 36 ebpf_exporter_poc_exp2zero_values_count 9 ``` With `exp2` histogram: * Go: ``` $ go run cmd/derp/main.go poc_metric_bucket{le="1"} 1 poc_metric_bucket{le="2"} 2 poc_metric_bucket{le="4"} 4 poc_metric_bucket{le="8"} 8 poc_metric_bucket{le="+Inf"} 8 poc_metric_sum 36 poc_metric_count 8 ``` * `ebpf_exporter` ``` ebpf_exporter_poc_exp2_values_bucket{le="1"} 1 ebpf_exporter_poc_exp2_values_bucket{le="2"} 2 ebpf_exporter_poc_exp2_values_bucket{le="4"} 4 ebpf_exporter_poc_exp2_values_bucket{le="8"} 8 ebpf_exporter_poc_exp2_values_bucket{le="+Inf"} 8 ebpf_exporter_poc_exp2_values_sum 36 ebpf_exporter_poc_exp2_values_count 8 ```
I opened #515 to address this. Let me know what you think. |
See cloudflare#488 for more details. With `exp2zero` histogram: * Go: ``` $ go run cmd/derp/main.go poc_metric_bucket{le="0"} 1 poc_metric_bucket{le="1"} 2 poc_metric_bucket{le="2"} 3 poc_metric_bucket{le="4"} 5 poc_metric_bucket{le="8"} 9 poc_metric_bucket{le="+Inf"} 9 poc_metric_sum 36 poc_metric_count 9 ``` * `ebpf_exporter`: ``` ebpf_exporter_poc_exp2zero_values_bucket{le="0"} 1 ebpf_exporter_poc_exp2zero_values_bucket{le="1"} 2 ebpf_exporter_poc_exp2zero_values_bucket{le="2"} 3 ebpf_exporter_poc_exp2zero_values_bucket{le="4"} 5 ebpf_exporter_poc_exp2zero_values_bucket{le="8"} 9 ebpf_exporter_poc_exp2zero_values_bucket{le="+Inf"} 9 ebpf_exporter_poc_exp2zero_values_sum 36 ebpf_exporter_poc_exp2zero_values_count 9 ``` With `exp2` histogram: * Go: ``` $ go run cmd/derp/main.go poc_metric_bucket{le="1"} 1 poc_metric_bucket{le="2"} 2 poc_metric_bucket{le="4"} 4 poc_metric_bucket{le="8"} 8 poc_metric_bucket{le="+Inf"} 8 poc_metric_sum 36 poc_metric_count 8 ``` * `ebpf_exporter` ``` ebpf_exporter_poc_exp2_values_bucket{le="1"} 1 ebpf_exporter_poc_exp2_values_bucket{le="2"} 2 ebpf_exporter_poc_exp2_values_bucket{le="4"} 4 ebpf_exporter_poc_exp2_values_bucket{le="8"} 8 ebpf_exporter_poc_exp2_values_bucket{le="+Inf"} 8 ebpf_exporter_poc_exp2_values_sum 36 ebpf_exporter_poc_exp2_values_count 8 ```
See cloudflare#488 for more details. With `exp2zero` histogram: * Go: ``` $ go run cmd/derp/main.go poc_metric_bucket{le="0"} 1 poc_metric_bucket{le="1"} 2 poc_metric_bucket{le="2"} 3 poc_metric_bucket{le="4"} 5 poc_metric_bucket{le="8"} 9 poc_metric_bucket{le="+Inf"} 9 poc_metric_sum 36 poc_metric_count 9 ``` * `ebpf_exporter`: ``` ebpf_exporter_poc_exp2zero_values_bucket{le="0"} 1 ebpf_exporter_poc_exp2zero_values_bucket{le="1"} 2 ebpf_exporter_poc_exp2zero_values_bucket{le="2"} 3 ebpf_exporter_poc_exp2zero_values_bucket{le="4"} 5 ebpf_exporter_poc_exp2zero_values_bucket{le="8"} 9 ebpf_exporter_poc_exp2zero_values_bucket{le="+Inf"} 9 ebpf_exporter_poc_exp2zero_values_sum 36 ebpf_exporter_poc_exp2zero_values_count 9 ``` With `exp2` histogram: * Go: ``` $ go run cmd/derp/main.go poc_metric_bucket{le="1"} 1 poc_metric_bucket{le="2"} 2 poc_metric_bucket{le="4"} 4 poc_metric_bucket{le="8"} 8 poc_metric_bucket{le="+Inf"} 8 poc_metric_sum 36 poc_metric_count 8 ``` * `ebpf_exporter` ``` ebpf_exporter_poc_exp2_values_bucket{le="1"} 1 ebpf_exporter_poc_exp2_values_bucket{le="2"} 2 ebpf_exporter_poc_exp2_values_bucket{le="4"} 4 ebpf_exporter_poc_exp2_values_bucket{le="8"} 8 ebpf_exporter_poc_exp2_values_bucket{le="+Inf"} 8 ebpf_exporter_poc_exp2_values_sum 36 ebpf_exporter_poc_exp2_values_count 8 ```
Hello!
I think I may have found an off-by-one error in the various histogram macros provided by
maps.bpf.h
- the tl;dr is that when I record a value in anexp2
histogram, I see that value reflected in the bucket below or equal to that value, rather than the first bucket above to that value.For example, let's say I have buckets for 1, 2, 4, 8, ..., 1024, and I record the value 268. I would expect the buckets corresponding to
le="5.12e-07"
andle="1.024e-06"
to have their values incremented by one - but I am also seeing the bucket forle="2.56e-07"
get incremented.Minimal Example
Here is a fairly minimal example that demonstrates this by just inserting the values 1-8 into an
exp2
histogram (I usedtp/sched/sched_process_exec
as a proxy for an event that will happen almost immediately upon startup - my BPF knowledge is mostly limited tobpftrace
and I didn't see an analogue to itsBEGIN
block that I could use):...and the output of doing a
curl -s http://localhost:9435/metrics | grep ebpf_exporter_poc_values_bucket
againstsudo ./ebpf_exporter --config.dir=examples --config.names=off_by_one
:By contrast, here's a Go program that creates a Prometheus histogram and observes values 1-8:
...and the resulting output:
Suspected Cause
I think what's happening here is that
ebpf_exporter
is usinglog2l
to determine the bucket index, and since it's performing integer arithmetic,log2l
is discarding the fractional part of the result toward zero, solog(3) / log(2) = 1.5849625007211563
ends up as1
, but it should be2
.The text was updated successfully, but these errors were encountered: