Approximate Percentile #35
Replies: 2 comments 3 replies
-
The proposed solution of using a T-Digest should work great for any arbitrary precision numeric types (decimal, double precision, etc.). However, it's not clear that this is a great choice for integer types. Should we consider using a different underlying abstraction for these types? Does anyone have any suggestions off the top of their head for a better solution?
-
I would like to strongly push for the use of UDDSketch, which builds upon the DDSketch from Datadog. That is what we are using for graphmetrics.io, where we currently do most of the merge and approximation work using SQL functions. DDSketch is great for unbounded series, and UDDSketch allows for bounded series while keeping a known relative error bound (which obviously increases with each downsampling).
UDDSketch:
I wrote a more detailed explanation here: #41
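To make the relative-error point concrete, here is a minimal Python sketch of the idea as I understand it from the DDSketch and UDDSketch papers. It is not the code used at graphmetrics.io or any library's API: the class name `LogBucketSketch` and its parameters are hypothetical, and it only handles positive values. Values go into logarithmically sized buckets, so any estimate is within a relative error `alpha` of the true value. When a bucket limit is set, adjacent bucket pairs are merged (a UDDSketch-style uniform collapse), which squares the growth factor and raises `alpha`.

```python
import math
from collections import Counter


class LogBucketSketch:
    """Toy DDSketch-style quantile sketch for positive values (illustrative only)."""

    def __init__(self, alpha=0.01, max_buckets=None):
        if max_buckets is not None and max_buckets < 2:
            raise ValueError("max_buckets must be at least 2")
        self.alpha = alpha                      # current relative error bound
        self.gamma = (1 + alpha) / (1 - alpha)  # bucket growth factor
        self.max_buckets = max_buckets          # None means no bucket limit
        self.buckets = Counter()                # bucket index -> count
        self.count = 0

    def add(self, value):
        if value <= 0:
            raise ValueError("this toy sketch only handles positive values")
        # Bucket i covers the interval (gamma^(i-1), gamma^i].
        index = math.ceil(math.log(value) / math.log(self.gamma))
        self.buckets[index] += 1
        self.count += 1
        self._collapse_if_needed()

    def _collapse_if_needed(self):
        # Uniform collapse: buckets (2j-1, 2j) merge into bucket j,
        # which corresponds to replacing gamma with gamma^2.
        while self.max_buckets is not None and len(self.buckets) > self.max_buckets:
            collapsed = Counter()
            for index, n in self.buckets.items():
                collapsed[(index + 1) // 2] += n
            self.buckets = collapsed
            self.gamma = self.gamma ** 2
            self.alpha = (self.gamma - 1) / (self.gamma + 1)

    def quantile(self, q):
        if self.count == 0:
            raise ValueError("empty sketch")
        rank = q * (self.count - 1)
        seen = 0
        for index in sorted(self.buckets):
            seen += self.buckets[index]
            if seen > rank:
                # Point inside the bucket that is within alpha of every value in it.
                return 2 * self.gamma ** index / (self.gamma + 1)
        raise AssertionError("unreachable for 0 <= q <= 1")


unbounded = LogBucketSketch(alpha=0.01)
bounded = LogBucketSketch(alpha=0.01, max_buckets=32)
for v in range(1, 1001):
    unbounded.add(v)
    bounded.add(v)

median_unbounded = unbounded.quantile(0.5)
median_bounded = bounded.quantile(0.5)
```

The bounded sketch never holds more than 32 buckets, and `bounded.alpha` reports the larger error bound it ended up with after collapsing, which is the trade-off described above.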
-
Discussion for Approximate Percentile
Original issue
**What's the functionality you would like to add** An approximate percentile function such as [t-digest](https://github.com/tdunning/t-digest). This would have two main advantages over `percentile_cont`:
**How would the function be used**
Basic percentile calculation works just like exact percentile calculation, except that it takes an accuracy parameter (for t-digest, the number of centroids).
When storing the data, for instance in continuous aggregates, the digest itself can be stored instead of the percentile, allowing future analysis to choose which percentiles it wants to get out.
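The value of storing the digest rather than a single percentile can be illustrated with a small Python sketch. It does not use t-digest itself: as a stand-in it uses a simple log-bucket histogram, because that merges by plain addition. The names `build_sketch`, `merge`, and `quantile`, and the sample data, are all hypothetical. Each time bucket keeps its own sketch, and any percentile over any combination of buckets can be computed later.

```python
import math
from collections import Counter

ALPHA = 0.02                         # assumed relative accuracy
GAMMA = (1 + ALPHA) / (1 - ALPHA)    # bucket growth factor


def build_sketch(values):
    """Summarise positive values as log-bucket counts (the stored 'digest')."""
    return Counter(math.ceil(math.log(v) / math.log(GAMMA)) for v in values)


def merge(sketches):
    """Combine stored sketches; bucket counts simply add up."""
    total = Counter()
    for sketch in sketches:
        total.update(sketch)
    return total


def quantile(sketch, q):
    """Estimate the q-quantile (0 <= q <= 1) from a sketch."""
    n = sum(sketch.values())
    if n == 0:
        raise ValueError("empty sketch")
    rank = q * (n - 1)
    seen = 0
    for index in sorted(sketch):
        seen += sketch[index]
        if seen > rank:
            return 2 * GAMMA ** index / (GAMMA + 1)
    raise AssertionError("unreachable for 0 <= q <= 1")


# One stored sketch per hour, as a continuous aggregate might keep them.
hourly = {
    "00:00": build_sketch([12.0, 15.0, 11.0, 240.0]),
    "01:00": build_sketch([14.0, 13.0, 900.0, 16.0]),
}

# Later analysis picks the time range and the percentiles it needs.
daily = merge(hourly.values())
p50 = quantile(daily, 0.5)
p99 = quantile(daily, 0.99)
```

Had only the hourly medians been stored, neither the daily median nor any other percentile could be recovered from them; with the sketches, both come from the same stored rows.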
Open Questions
TBD