Add `schema_id`, `config_hash` attributes to the `apollo_router_session_count_total` metric #6527
This adds `schema_id` and `config_hash` attributes to the `apollo_router_session_count_total` metric, so customers can see whether they have sessions that keep references to large amounts of previous router state alive after a configuration or schema change. This would help diagnose potential memory usage issues and inform our approach to dealing with them.

To do this, the `session_count_total` instrument is refactored a bit: it previously used a single global state value; now multiple total session counts with different attributes can be alive at the same time.

For the `schema_id`, I just had to thread it through several layers of functions. For the config, I "invented" a new config hash (SHA-256 of the JSON-serialized value).

This also fixes a bug that was introduced in 797a3be. In that commit the total session count was changed from a manual counter to a guard, but the guard was not moved into the tokio task properly, so it would immediately decrement. Luckily, that change has not made it into a release.
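To make the guard behaviour concrete, here is a minimal sketch of the idea, not the code in this PR. `SessionAttrs`, `SessionCounts`, `SessionGuard`, and `demo` are hypothetical names invented for illustration, a plain `HashMap` stands in for the real instrument, and `std::thread::spawn` stands in for the tokio task so the example needs only the standard library. It shows one count per attribute set, and why the guard has to be moved into the task to keep the count up.

```rust
use std::collections::HashMap;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Attribute set that keys one session count (hypothetical type for this sketch).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct SessionAttrs {
    schema_id: String,
    config_hash: String,
}

/// Shared table of live session counts, one entry per attribute set.
#[derive(Clone, Default)]
struct SessionCounts(Arc<Mutex<HashMap<SessionAttrs, i64>>>);

impl SessionCounts {
    /// Increments the count for `attrs` and returns a guard that decrements it on drop.
    fn track(&self, attrs: SessionAttrs) -> SessionGuard {
        *self.0.lock().unwrap().entry(attrs.clone()).or_insert(0) += 1;
        SessionGuard { counts: self.clone(), attrs }
    }

    /// Current count for `attrs`, or 0 if no session ever used them.
    fn get(&self, attrs: &SessionAttrs) -> i64 {
        self.0.lock().unwrap().get(attrs).copied().unwrap_or(0)
    }
}

struct SessionGuard {
    counts: SessionCounts,
    attrs: SessionAttrs,
}

impl Drop for SessionGuard {
    fn drop(&mut self) {
        if let Some(n) = self.counts.0.lock().unwrap().get_mut(&self.attrs) {
            *n -= 1;
        }
    }
}

/// Returns (old count after an early drop, old count while the task runs,
/// new count while the task runs, old count after the task ends).
fn demo() -> (i64, i64, i64, i64) {
    let counts = SessionCounts::default();
    let old = SessionAttrs { schema_id: "schema-a".into(), config_hash: "cfg-1".into() };
    let new = SessionAttrs { schema_id: "schema-b".into(), config_hash: "cfg-1".into() };

    // Buggy shape: the guard is dropped by the spawning code instead of living
    // inside the task, so the count falls back immediately.
    let guard = counts.track(old.clone());
    drop(guard);
    let dropped_early = counts.get(&old);

    // Fixed shape: the guard is moved into the task and lives as long as it does.
    let (release_tx, release_rx) = mpsc::channel::<()>();
    let guard = counts.track(old.clone());
    let task = thread::spawn(move || {
        let _guard = guard; // owned by the task, dropped when the task ends
        release_rx.recv().ok(); // simulate a long-lived session
    });

    // A session created after a schema change is counted under its own attributes.
    let _new_guard = counts.track(new.clone());
    let (old_live, new_live) = (counts.get(&old), counts.get(&new));

    release_tx.send(()).unwrap();
    task.join().unwrap();
    (dropped_early, old_live, new_live, counts.get(&old))
}
```

In this sketch, a non-zero count under the old `schema_id` after a reload is the signal described above: a session is still holding on to state from the previous schema.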
I'll mark this as a draft to add some tests for the metric to validate both this PR and the basic behaviour.