[Bug]: it is decompressing more rows than seems to be needed while deleting #7110
Comments
Hmm, I cannot reproduce with the script you provided. I'm getting no decompression for that delete.
|
Let's sync on how we run the reproduction script. I am launching TimescaleDB through Docker first, no volumes mounted, fresh DB with the default config:
Then I create the DB:
And then I run the reproduction script:
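Since the original Docker command and script aren't reproduced in this thread, here is a minimal SQL sketch of what such a reproduction could look like; the table and column names (readings, device_id, generated_at, value) are assumptions based on the schema described in the issue body, not the poster's actual script:

```sql
-- Hypothetical reproduction sketch; names are assumed, not taken from the
-- original attachment.
CREATE EXTENSION IF NOT EXISTS timescaledb;

CREATE TABLE readings (
    generated_at timestamptz      NOT NULL,
    device_id    uuid             NOT NULL,
    value        double precision
);

SELECT create_hypertable('readings', 'generated_at');

ALTER TABLE readings SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id',
    timescaledb.compress_orderby   = 'generated_at'
);

-- Insert some data for a few devices, then compress the chunks before
-- attempting the DELETE.
INSERT INTO readings
SELECT '2024-01-01'::timestamptz + (i || ' seconds')::interval,
       (ARRAY['00000000-0000-0000-0000-000000000001',
              '00000000-0000-0000-0000-000000000002',
              '00000000-0000-0000-0000-000000000003'])[1 + i % 3]::uuid,
       random()
FROM generate_series(1, 100000) AS i;

SELECT compress_chunk(c) FROM show_chunks('readings') AS c;
```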
|
It's probably helpful to know that for one kind of query the planner already works as desired: single-row deletion, where all clauses check equality against a literal:
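As a hedged illustration (using the hypothetical readings schema sketched earlier, not the poster's actual statement), the fast case looks like this:

```sql
-- All clauses are literal equality checks on the segmentby and orderby
-- columns, so only a single compressed batch needs to be touched.
DELETE FROM readings
WHERE device_id    = '00000000-0000-0000-0000-000000000001'
  AND generated_at = '2024-01-01 00:00:01+00';
```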
It passes successfully and immediately (9 ms). But that no longer holds if we try to target more rows with the query from above:
So could the functionality used for single-row deletion perhaps be reused for multiple rows as well? |
I got some different timestamp values so that no rows were deleted (for Sven as well), but even if I fix them, the delete doesn't decompress:
But with OR I do see the "tuple limit exceeded":
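A hedged sketch of the OR-based shape being discussed, again against the hypothetical readings schema rather than the original script:

```sql
-- With OR, the filters are not pushed down to the compressed batches, so
-- far more rows get decompressed than are actually deleted, which is what
-- trips the tuple decompression limit.
DELETE FROM readings
WHERE (device_id = '00000000-0000-0000-0000-000000000001'
       AND generated_at = '2024-01-01 00:00:01+00')
   OR (device_id = '00000000-0000-0000-0000-000000000002'
       AND generated_at = '2024-01-01 00:00:02+00');
```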
@svenklemm probably we should treat this as a feature request to support OR clauses for filtering in compressed DML? |
With
but I am trying to delete only a few rows and decompression of 320,000 rows happens. In my case the query with the OR clause looks like this:
it attempts to delete 2 rows and that attempt ends with
while using
Just curious how this can happen, and how the reproduction SQL mentioned above can give different timestamps and outcomes. |
The expression |
I've tried to replace '2024-01-01' with '2024-01-01 00:00:00.000+00' and nothing changed. Anyway, if on your side the timestamps are somehow different, you can find a value and then use it inside the query like this:
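A sketch of both steps, with the query shape and schema assumed from the thread rather than copied from the original comment:

```sql
-- Step 1 (assumed shape): find a timestamp value that actually exists.
SELECT generated_at FROM readings ORDER BY generated_at LIMIT 1;

-- Step 2 (assumed shape): reuse that value in the delete; a subquery is
-- used here only to keep the snippet self-contained.
DELETE FROM readings
WHERE device_id    = '00000000-0000-0000-0000-000000000001'
  AND generated_at = (SELECT min(generated_at) FROM readings);
```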
And if your local config is the same as the one bundled with the Docker image, you are expected to get the error. |
Sorry for the confusion, I did already reproduce the excessive decompression you described, was just answering your question about the different timestamp. The reason for decompressing too much is that we don't support OR clauses for filtering what we decompress. This is something we have to improve. |
So |
This should either be supported naturally or listed as a limitation, because we have spent a lot of time debugging this as well, and it caused us issues on a production DB since it was unexpected behaviour. The docs state that DML operations are supported for compressed chunks, so one would not expect chunks exploding on delete. |
This one is a "scalar array operation"; support for it should be out with 2.16, implemented here: #6880 |
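For context, a scalar array operation is a filter of the form column = ANY(array); an IN list is rewritten to the same thing. A minimal sketch against the hypothetical schema used above:

```sql
-- As of 2.16, a scalar array filter like this can be used to limit which
-- compressed batches are decompressed during DML.
DELETE FROM readings
WHERE generated_at = ANY (ARRAY['2024-01-01 00:00:01+00',
                                '2024-01-01 00:00:02+00',
                                '2024-01-01 00:00:03+00']::timestamptz[]);
```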
Hello @akuzm, I'm still seeing this behaviour on version 2.16.1, but I'm using a slightly different query. I have a table with timestamp, device_id (uuid), and value columns. Compression is enabled, with device_id as segmentby. Now if I run:
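A hedged sketch of the query shape described here, with table and column names assumed rather than taken from the poster's actual statement:

```sql
-- One device per statement, with a batch of exact timestamps to remove.
-- Without further pushdown, the whole segment for this device_id may be
-- decompressed, which is what trips the tuple decompression limit.
DELETE FROM readings
WHERE device_id = '00000000-0000-0000-0000-000000000001'
  AND generated_at IN ('2024-01-01 00:00:01+00',
                       '2024-01-01 00:10:00+00',
                       '2024-01-02 03:25:17+00');
```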
I am still getting the error that it is trying to decompress over 100k tuples. Maybe it would make sense that it is trying to decompress the whole segment for that device_id, which is why I cannot use a larger (over 1664) batch in the IN list. Am I doing something incorrectly? What would be the best approach to do this? Basically I need to delete a lot of records from multiple devices, but these records may not be continuous, so I need to delete them by exact timestamps. |
UPDATE: If anybody stumbles upon this, I was able to reduce the delete time from 40 minutes to 55 seconds by sorting the batches, taking the min and max, and adding them to the query as an additional range condition on the timestamp, as in the sketch below.
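A sketch of that optimization against the hypothetical schema used above: sort the batch of timestamps, take its min and max, and add them as an extra range condition so that compressed batches outside that range can be skipped.

```sql
-- The IN list still selects the exact rows to delete; the extra range
-- condition on the orderby column lets compressed batches whose ranges lie
-- outside [min, max] of the batch be skipped instead of decompressed.
DELETE FROM readings
WHERE device_id = '00000000-0000-0000-0000-000000000001'
  AND generated_at >= '2024-01-01 00:00:01+00'   -- min of the sorted batch
  AND generated_at <= '2024-01-02 03:25:17+00'   -- max of the sorted batch
  AND generated_at IN ('2024-01-01 00:00:01+00',
                       '2024-01-01 00:10:00+00',
                       '2024-01-02 03:25:17+00');
```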
|
Have you had the chance to check out the latest versions containing improvements to this kind of DELETE query? |
What type of bug is this?
Unexpected error
What subsystems and features are affected?
Compression, Query executor, Query planner
What happened?
I have a table which looks like:
and on the query
I am getting an error:
And it looks strange, because in the query I've specified a filter on the segmentby field (device_id) together with the orderby field (generated_at) as well. So I would expect that it needs to decompress only the three actual segments.
TimescaleDB version affected
2.15.2
PostgreSQL version used
15.7
What operating system did you use?
Mac OS 14.5
What installation method did you use?
Docker
What platform did you run on?
Timescale Cloud
Relevant log output and stack trace
No response
How can we reproduce the bug?