[Bug]: Bad performance / Very high CPU load when using INSERT ... ON CONFLICT DO UPDATE on compressed chunks #5801
Comments
Hi @mblsf, thank you for filing the issue! But I was not able to observe the level of performance degradation that you mention in the bug.
Forgot to mention that we are using batch inserts; the values in one batch can belong to different chunks. For the hypertable definition I need to wait for a colleague to return. The query looks like this:

```sql
INSERT INTO values(key, ts_utc, value, state, id) VALUES ...
```

`get_id()` is used to generate a unique key.
Hi @mblsf Prior to 2.11.0, an INSERT into a compressed chunk would compress only the new row and insert it into the compressed chunk. From 2.11.0, INSERT with ON CONFLICT on a compressed chunk decompresses the matching data to verify the unique constraint, which is more expensive. In any case, we need details on the hypertable and on the compression settings, such as the segmentby columns, to understand the issue better.
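For reference, in TimescaleDB 2.x the segment-by and order-by settings can be read straight from the catalog; a sketch against the reporter's table:

```sql
SELECT attname, segmentby_column_index, orderby_column_index
FROM timescaledb_information.compression_settings
WHERE hypertable_name = 'signal_raw_values';
```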
Here is the requested information. Hope it's complete. If not, please reply.

```sql
CREATE TABLE IF NOT EXISTS warp_tsa_core.signal_raw_values
-- convert table to hypertable
-- Data older than 400 days in table signal_raw_values should be deleted automatically by timescaledb
-- Activates compression on signal_raw_values for data older than 30 days
-- Set compress policy to automatic compress chunks older than 30 days
```
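Since the column list and the policy calls are elided above, here is a minimal sketch of what such a setup typically looks like. The column types, the segment-by choice, and the key columns are assumptions pieced together from the rest of the thread, not the reporter's exact DDL:

```sql
CREATE TABLE IF NOT EXISTS warp_tsa_core.signal_raw_values (
    signal_key       integer     NOT NULL,
    raw_ts_utc       timestamptz NOT NULL,
    raw_dupl_id      smallint    NOT NULL DEFAULT 0,
    raw_value_num    double precision,
    raw_value_string text,
    raw_state_name   text,
    raw_good_flag    boolean,
    load_id          bigint,
    UNIQUE (signal_key, raw_ts_utc, raw_dupl_id)
);

-- convert table to hypertable
SELECT create_hypertable('warp_tsa_core.signal_raw_values', 'raw_ts_utc');

-- drop data older than 400 days automatically
SELECT add_retention_policy('warp_tsa_core.signal_raw_values', INTERVAL '400 days');

-- activate compression and compress chunks older than 30 days
ALTER TABLE warp_tsa_core.signal_raw_values SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'signal_key',
    timescaledb.compress_orderby   = 'raw_ts_utc'
);
SELECT add_compression_policy('warp_tsa_core.signal_raw_values', INTERVAL '30 days');
```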
@mblsf Please also provide the queries you are facing issues with.
This query inserts 153 new entries; it can be up to 1000 entries per query:

```sql
INSERT INTO signal_raw_values(signal_key, raw_ts_utc, raw_value_num, raw_value_string, raw_state_name, raw_good_flag, load_id)
VALUES
  (%s, %s, %s, %s, %s, %s, get_load_id()),
  (%s, %s, %s, %s, %s, %s, get_load_id()),
  -- ... the same placeholder row repeated once per entry ...
  (%s, %s, %s, %s, %s, %s, get_load_id())
ON CONFLICT (signal_key, raw_ts_utc, raw_dupl_id) DO UPDATE SET
  raw_value_num    = EXCLUDED.raw_value_num,
  raw_value_string = EXCLUDED.raw_value_string,
  raw_state_name   = EXCLUDED.raw_state_name,
  raw_good_flag    = EXCLUDED.raw_good_flag;
```

We don't have any problems when executing this query with uncompressed chunks.
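To see where the time goes, one can compare the plan of a single-row upsert targeting a compressed chunk versus an uncompressed one. A sketch; the literal values are placeholders:

```sql
EXPLAIN (ANALYZE, BUFFERS)
INSERT INTO warp_tsa_core.signal_raw_values
    (signal_key, raw_ts_utc, raw_value_num, raw_value_string,
     raw_state_name, raw_good_flag, load_id)
VALUES (42, '2023-06-01 12:00:00+00', 1.5, NULL, 'GOOD', true, get_load_id())
ON CONFLICT (signal_key, raw_ts_utc, raw_dupl_id) DO UPDATE
    SET raw_value_num = EXCLUDED.raw_value_num;
```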
Any updates yet? Do you need more information?
Maybe our chunk size is too large. What's the recommended uncompressed chunk size for TimescaleDB?
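For what it's worth, per-chunk sizes can be inspected with a built-in function; the usual guidance is that actively written chunks (including their indexes) should fit in memory, commonly quoted as roughly 25% of RAM combined:

```sql
SELECT chunk_name, pg_size_pretty(total_bytes) AS total_size
FROM chunks_detailed_size('warp_tsa_core.signal_raw_values')
ORDER BY total_bytes DESC;
```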
@lkshminarayanan, @sb230132, @RafiaSabih Could one of you please have a look at this issue again? I think we've recently provided all the necessary information.
@mblsf Thank you for providing the information, especially the segment-by and order-by columns that you use for compression. Given this information, I do not think that a large chunk size is the reason for the bad performance. Could you run the following query to get some statistics on the columns?

```sql
SELECT schemaname, tablename, attname, n_distinct
FROM pg_stats
WHERE schemaname = 'warp_tsa_core'
  AND tablename = 'signal_raw_values'
  AND inherited;
```
Here is the issue that should fix the performance of INSERT ... ON CONFLICT queries on compressed chunks.
The INSERT query with ON CONFLICT on a compressed chunk does a heap scan to verify unique constraint violations. This patch improves performance by doing an index scan on the compressed chunk to fetch matching records based on the segmentby columns. These matching records are inserted into the uncompressed chunk, and then the unique constraint violation is verified. Since the index on the compressed chunk has only segmentby columns, we cannot do a point lookup that also considers orderby columns. This patch will therefore decompress matching records based only on the segmentby columns. Fixes #6063 Fixes #5801
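Until that fix ships, one stopgap is to explicitly decompress the chunks a batch will touch before the upsert and recompress them afterwards. A minimal sketch using standard TimescaleDB calls; the 40-day window is an assumption based on the 30-day compression policy above:

```sql
-- Decompress chunks the incoming batch may touch (window is an assumption).
SELECT decompress_chunk(c, if_compressed => true)
FROM show_chunks('warp_tsa_core.signal_raw_values',
                 newer_than => now() - INTERVAL '40 days') AS c;

-- ... run the INSERT ... ON CONFLICT batch here ...

-- Recompress the same chunks afterwards.
SELECT compress_chunk(c, if_not_compressed => true)
FROM show_chunks('warp_tsa_core.signal_raw_values',
                 newer_than => now() - INTERVAL '40 days') AS c;
```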
@mkindahl Here are the results for your query, if still relevant:

| schemaname | tablename | attname | n_distinct |
|---|---|---|---|
| warp_tsa_core | signal_raw_values | signal_key | 22309 |
Thank you. It seems you picked a good segment-by key; I just wanted to rule out that a bad segment-by key was the cause.
@mkindahl You're welcome. Please let me know when there are any updates related to this issue.
Hi @mkindahl, any updates?
Using 2.13 on PG15 in Aiven/MST. I have not measured the speed difference of INSERT ON CONFLICT etc., but what I have observed is that UNIQUE indexes are not enforced: duplicate rows can be inserted, as per my recent comments in #6302. Re the comments from @sb230132:

I'm not sure the decompression is occurring prior to the insertion into a compressed chunk. I think this because the unique index is not being triggered and I am ending up with duplicate rows: one row in the compressed chunk, and one duplicate in the uncompressed part of the chunk. Once the duplicate row is in there, decompression of the chunk then fails because the UNIQUE index errors on the duplicate row.
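A quick way to confirm such duplicates, shown against the reporter's schema from earlier in the thread as a stand-in (adapt the key columns to your own table):

```sql
SELECT signal_key, raw_ts_utc, raw_dupl_id, count(*) AS copies
FROM warp_tsa_core.signal_raw_values
GROUP BY signal_key, raw_ts_utc, raw_dupl_id
HAVING count(*) > 1;
```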
Is DML on compressed chunks fundamentally broken??? It would seem so to me 😢
There is a bug preventing this from working with unique expression indexes; for non-expression indexes it should work in 2.13.
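To illustrate the distinction with a made-up table (not from this thread):

```sql
CREATE TABLE t (device_id int NOT NULL, name text NOT NULL, ts timestamp NOT NULL);
-- plain (non-expression) unique index: enforced on compressed chunks in 2.13
CREATE UNIQUE INDEX t_plain_uq ON t (device_id, ts);
-- expression unique index: the case hit by the bug mentioned above
CREATE UNIQUE INDEX t_expr_uq ON t (device_id, lower(name), ts);
```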
INSERT ON CONFLICT DO UPDATE received optimizations recently that should help with this; they are available with version 2.17. Have you had the chance to try it out?
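To check the installed version and upgrade, the standard commands are below; note that the ALTER must be the first command in a fresh session:

```sql
SELECT extversion FROM pg_extension WHERE extname = 'timescaledb';
ALTER EXTENSION timescaledb UPDATE;  -- after installing the 2.17 binaries
```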
It worked, thanks :)
What type of bug is this?
Performance issue
What subsystems and features are affected?
Compression
What happened?
When using the new INSERT ... ON CONFLICT DO UPDATE feature on compressed chunks that contain a lot of data, performance is very poor (processing 100 records takes 4 seconds instead of 40 ms) and the CPU load on Postgres stays very high during the operation.
TimescaleDB version affected
2.11.0
PostgreSQL version used
14.8
What operating system did you use?
Ubuntu 22.04 x64
What installation method did you use?
Docker
What platform did you run on?
On prem/Self-hosted
Relevant log output and stack trace
No response
How can we reproduce the bug?