Incremental CAgg Refresh Policy #7790

Open
wants to merge 16 commits into
base: main

Conversation

fabriziomello
Contributor

@fabriziomello fabriziomello commented Mar 4, 2025

Currently, a Continuous Aggregate (CAgg) refresh policy processes the entire refresh window in a single pass, regardless of how large the window is. For example, on a hypertable with a huge number of rows, refreshing a CAgg can take a long time and consume a lot of CPU, memory, and I/O, and none of the aggregated data becomes visible to users until the refresh policy completes its execution.

This PR adds the ability to execute a CAgg refresh policy incrementally, in "batches". Each "batch" is an individual transaction that processes a small fraction of the entire refresh window. Once a batch finishes, the data it refreshed is visible to users, even before the policy execution ends.

To tune and control the incremental refresh, two new options were added to the add_continuous_aggregate_policy API:

  • buckets_per_batch: number of buckets to be refreshed by a "batch". This value is multiplied by the CAgg bucket width to determine the size of the batch range. The default is 0 (zero), which keeps the current behavior of single-batch execution. Values less than 0 (zero) are not allowed.
  • max_batches_per_execution: maximum number of batches to be executed by a single policy execution. If some batches remain, they are processed the next time the policy runs. The default is 10 (ten), which means each job execution processes at most ten batches. Values less than 0 (zero) are not allowed.
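
To make the interaction of the two options concrete, here is a minimal Python sketch of how a refresh window could be split into batch ranges. It is not the PR's C implementation: the function name plan_batches, the integer time units, the oldest-first ordering, and the treatment of max_batches_per_execution = 0 as "no limit" are all assumptions made for illustration. Only the batch-size rule (buckets_per_batch times the bucket width), the defaults, and the rejection of negative values come from the description above.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end) in arbitrary integer time units


def plan_batches(
    window_start: int,
    window_end: int,
    bucket_width: int,
    buckets_per_batch: int = 0,
    max_batches_per_execution: int = 10,
) -> Tuple[List[Range], int]:
    """Hypothetical helper: split a refresh window into batch ranges.

    Returns the batches one policy execution would process and the number
    of batches left over for later executions.
    """
    # Negative values are not allowed for either option (per the PR text).
    if buckets_per_batch < 0 or max_batches_per_execution < 0:
        raise ValueError("values less than 0 are not allowed")
    if bucket_width <= 0:
        raise ValueError("bucket_width must be positive")
    if window_end <= window_start:
        return [], 0

    # buckets_per_batch = 0 keeps the previous behavior: a single batch.
    if buckets_per_batch == 0:
        return [(window_start, window_end)], 0

    # Batch range size = buckets_per_batch * CAgg bucket width.
    batch_size = buckets_per_batch * bucket_width
    batches = [
        (start, min(start + batch_size, window_end))
        for start in range(window_start, window_end, batch_size)
    ]

    # Assumption: 0 means "no limit"; the PR text does not define this case.
    limit = max_batches_per_execution or len(batches)
    return batches[:limit], max(len(batches) - limit, 0)


if __name__ == "__main__":
    # 24 one-unit buckets, 2 buckets per batch, default limit of 10 batches.
    now, remaining = plan_batches(0, 24, bucket_width=1, buckets_per_batch=2)
    print(len(now), remaining)
```

In this sketch each returned range would correspond to one transaction, so the data for a range becomes visible as soon as that range is committed, and any leftover batches wait for the next scheduled run.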


codecov bot commented Mar 4, 2025

Codecov Report

Attention: Patch coverage is 81.72043% with 34 lines in your changes missing coverage. Please review.

Project coverage is 81.88%. Comparing base (59f50f2) to head (3ee29cd).
Report is 806 commits behind head on main.

Files with missing lines                         Patch %   Lines
tsl/src/continuous_aggs/refresh.c                77.39%    9 Missing and 17 partials ⚠️
tsl/src/bgw_policy/continuous_aggregate_api.c    85.71%    2 Missing and 1 partial ⚠️
tsl/src/bgw_policy/job.c                         89.65%    0 Missing and 3 partials ⚠️
src/ts_catalog/continuous_agg.c                  80.00%    0 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7790      +/-   ##
==========================================
+ Coverage   80.06%   81.88%   +1.81%     
==========================================
  Files         190      247      +57     
  Lines       37181    45627    +8446     
  Branches     9450    11418    +1968     
==========================================
+ Hits        29770    37361    +7591     
- Misses       2997     3767     +770     
- Partials     4414     4499      +85     


@fabriziomello fabriziomello force-pushed the cagg_refresh_policy_incremental branch 10 times, most recently from 81f49e3 to b697dae Compare March 5, 2025 22:40
@fabriziomello fabriziomello added this to the v2.19.0 milestone Mar 5, 2025
Contributor

@mkindahl mkindahl left a comment

Some minor comments. Since it is in draft, I will hold off on approving until you have the final version.

@fabriziomello fabriziomello force-pushed the cagg_refresh_policy_incremental branch from 4e90f1e to d976191 Compare March 6, 2025 23:59
@fabriziomello fabriziomello marked this pull request as ready for review March 7, 2025 00:14
Contributor

@mkindahl mkindahl left a comment

A few questions regarding some parts of the code where I am not sure if it is correct or not.

@fabriziomello fabriziomello force-pushed the cagg_refresh_policy_incremental branch from d865281 to b139fa1 Compare March 7, 2025 22:26
2 participants