Metrics Discussion #2211
Comments
I like the idea of a general health trend. Host companies should avoid trying to drive these down to 0; that won't be possible. Instead, use this as an indicator of patterns. It would be a use case for control charts or AI-based anomaly detection (not provided by Prebid :) )
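To illustrate the control-chart idea, here is a minimal sketch, not anything Prebid provides. It assumes the host company already exports a per-interval count of alerts; the sample numbers and the function names `control_limits` and `out_of_control` are made up for illustration. It flags intervals that fall outside the mean plus or minus three standard deviations of a baseline window, rather than alerting on any non-zero value.

```python
from statistics import mean, stdev

def control_limits(baseline, sigmas=3.0):
    """Return (lower, upper) control limits from a baseline window of counts."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    lower = max(0.0, center - sigmas * spread)  # counts cannot go below zero
    upper = center + sigmas * spread
    return lower, upper

def out_of_control(baseline, recent, sigmas=3.0):
    """Return (index, value) pairs in `recent` that fall outside the limits."""
    lower, upper = control_limits(baseline, sigmas)
    return [(i, v) for i, v in enumerate(recent) if v < lower or v > upper]

# Hypothetical alert counts per interval (illustrative data only).
baseline = [12, 15, 11, 14, 13, 16, 12, 14]
recent = [13, 15, 41, 14]

flagged = out_of_control(baseline, recent)
print(flagged)
```

The point of the design is that a steady background rate of alerts stays quiet, and only a shift in the pattern gets attention.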
I'd like to see a more specific idea of what you have in mind for general and request alerts. For example, we already have request errors by endpoint. How would this be different? Might it be more useful to have slightly more detailed buckets that give a better idea of the source of the error? We can add more so long as there is no account or adapter cardinality. I also like the idea of giving guidance on how long to keep metrics, but that's purely up to the host company to configure. None of the metrics systems supported by PBS-X allow for a TTL.
I was thinking that we wouldn't start out moving existing metrics so much as having a place to put new alert metrics. For example, several of the recent PRDs define edge cases for data validation. Last thing we need is a separate alert for "floor vendor's JSON doesn't contain a required field". Here are some recent mentions of metrics in PRDs:
It was pointed out in the last meeting that we already have places to put errors:
So to flesh out the proposal more, I propose:
I would move some existing metrics into alert.general:
While reviewing metrics, a new one was requested:
Prebid Server has lots of operational metrics. Some would say too many. PBS-Java's metrics are at https://github.com/prebid/prebid-server-java/blob/master/docs/metrics.md
Towards rationalizing the set of metrics, here's a proposed framework that divides them into three types:
A key issue with metrics is the load on the metrics database: tracking metrics at a granular level can be expensive. There are a large number of account × adapter combinations, and with a high volume of traffic, keeping metrics for all combinations can become expensive. We've addressed part of this combinatorial explosion by turning account-level metrics off by default.
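As a rough illustration of that combinatorial explosion, the arithmetic can be sketched as below. The figures (500 accounts, 100 adapters, 5 metrics per combination) are hypothetical and not taken from any real deployment; `series_count` is just a helper name for this example.

```python
def series_count(accounts, adapters, metrics_per_pair):
    """Number of distinct time series if every combination is tracked."""
    return accounts * adapters * metrics_per_pair

# Hypothetical sizes, chosen only to show the scale of the difference.
with_account_level = series_count(accounts=500, adapters=100, metrics_per_pair=5)

# With account-level metrics off, the account dimension collapses to one.
without_account_level = series_count(accounts=1, adapters=100, metrics_per_pair=5)

print(with_account_level, without_account_level)
```

Under these assumed numbers, dropping the account dimension reduces the series count by a factor equal to the number of accounts.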
For this thread, I'd like to propose that 'data quality' metrics don't need to be detailed. Data quality issues should be in logs because they often require several fields to provide the info necessary for debugging. So really all we need is a general alert that lets operational staff know that it's time to go look in the logs. In fact, host companies with advanced log systems wouldn't even need metrics.
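A minimal sketch of that split, assuming an in-memory counter stands in for the real metrics system: the metric carries only a coarse bucket name, and the fields needed for debugging go to the log. The bucket `alert.request` is an assumed name based on the "general and request alerts" wording in this thread, and `report_data_quality_issue`, the logger name, and the sample account and field values are hypothetical, not PBS code.

```python
import logging
from collections import Counter

logger = logging.getLogger("pbs.data_quality")  # hypothetical logger name

# Stand-in for a real metrics client: one counter per coarse bucket.
alert_counts = Counter()

# Small, fixed set of buckets; no account or adapter in the metric name.
ALLOWED_BUCKETS = {"alert.general", "alert.request"}

def report_data_quality_issue(bucket, message, **details):
    """Count the issue in a coarse bucket and log the debugging details."""
    if bucket not in ALLOWED_BUCKETS:
        bucket = "alert.general"  # unknown buckets never create new series
    alert_counts[bucket] += 1
    logger.warning("%s: %s %s", bucket, message, details)

report_data_quality_issue(
    "alert.general",
    "floor vendor's JSON doesn't contain a required field",
    account="acct-1", field="data",
)
report_data_quality_issue("alert.request", "invalid request value", account="acct-2")
report_data_quality_issue("alert.floors.vendor-x", "unexpected bucket name")
```

Because the set of buckets is fixed, the number of metric series stays constant no matter how many accounts, adapters, or validation rules are added; operators see the counter move and then go to the logs for specifics.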
So as a matter of general error-reporting, I'd propose that we start placing data-quality metrics in a small number of buckets:
Looking forward to community input.