Slow insert on compressed chunks #6063

Closed
sb230132 opened this issue Sep 12, 2023 · 3 comments

@sb230132 (Contributor) commented Sep 12, 2023

What type of bug is this?

Performance issue

What subsystems and features are affected?

Compression

What happened?

An INSERT query with ON CONFLICT on a compressed chunk is very slow.

TimescaleDB version affected

2.12

PostgreSQL version used

15.3

What operating system did you use?

MacOSX

What installation method did you use?

Source

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

tsdb=> explain analyze insert into testdata (timestamp, idSite, code, instance, valuefloat) values (1689336100, 139400, 'V', 0, 3.3) on conflict do nothing;
                                                        QUERY PLAN                                                         
---------------------------------------------------------------------------------------------------------------------------
 Custom Scan (HypertableModify)  (cost=0.00..0.01 rows=1 width=152) (actual time=176.485..176.487 rows=0 loops=1)
   ->  Insert on testdata  (cost=0.00..0.01 rows=1 width=152) (actual time=176.485..176.487 rows=0 loops=1)
         Conflict Resolution: NOTHING
         Tuples Inserted: 1
         Conflicting Tuples: 0
         ->  Custom Scan (ChunkDispatch)  (cost=0.00..0.01 rows=1 width=152) (actual time=176.448..176.450 rows=1 loops=1)
               ->  Result  (cost=0.00..0.01 rows=1 width=152) (actual time=0.002..0.003 rows=1 loops=1)
 Planning Time: 0.144 ms
 Execution Time: 176.583 ms
(9 rows)

tsdb=> explain analyze insert into testdata (timestamp, idSite, code, instance, valuefloat) values (1689336100, 139400, 'V', 0, 3.3) on conflict do nothing;
                                                        QUERY PLAN                                                         
---------------------------------------------------------------------------------------------------------------------------
 Custom Scan (HypertableModify)  (cost=0.00..0.01 rows=1 width=152) (actual time=165.338..165.339 rows=0 loops=1)
   ->  Insert on testdata  (cost=0.00..0.01 rows=1 width=152) (actual time=165.338..165.339 rows=0 loops=1)
         Conflict Resolution: NOTHING
         Tuples Inserted: 1
         Conflicting Tuples: 0
         ->  Custom Scan (ChunkDispatch)  (cost=0.00..0.01 rows=1 width=152) (actual time=165.298..165.300 rows=1 loops=1)
               ->  Result  (cost=0.00..0.01 rows=1 width=152) (actual time=0.002..0.002 rows=1 loops=1)
 Planning Time: 0.127 ms
 Execution Time: 165.434 ms
(9 rows)


How can we reproduce the bug?

```sql
drop database db;
create database db;
\c db
create extension timescaledb;
create table testdata
(
  timestamp bigint not null,
  idSite integer null,
  code text not null,
  instance smallint not null,
  venusVersion text,
  productId text,
  valueFloat float,
  valueString text,
  valueEnum smallint
);

CREATE OR REPLACE FUNCTION unix_now() returns BIGINT LANGUAGE SQL STABLE as $$ SELECT extract(epoch from now())::BIGINT $$;
select create_hypertable('testdata', 'timestamp', chunk_time_interval=>3600, create_default_indexes=>False);
select set_integer_now_func('testdata', 'unix_now');

create unique index uniqueindex on testdata (idSite, code, instance, timestamp desc);
alter table testdata set (timescaledb.compress, timescaledb.compress_segmentby = 'idSite,code,instance', timescaledb.compress_orderby='timestamp desc');
select add_compression_policy('testdata', 14400);
copy testdata(timestamp,idsite,code,instance,venusversion,productid,valuefloat,valuestring,valueenum) FROM '/Users/bharathysatish/Downloads/testdata.csv' DELIMITER ',' CSV HEADER;
call run_job (1000);
EXPLAIN (ANALYZE, BUFFERS, SETTINGS) insert into testdata (timestamp, idSite, code, instance, valuefloat) values (1689336100, 139400, 'V', 0, 3.3) on conflict do nothing;
```

[testdata.csv.zip](https://github.com/timescale/timescaledb/files/12584252/testdata.csv.zip)
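
Before running the slow INSERT, it can help to confirm that the target chunk is actually compressed and to see which indexes exist on its compressed counterpart. A minimal inspection sketch follows; the `timescaledb_information.chunks` view and the `_timescaledb_internal` schema are standard, but the generated chunk names will differ per installation.

```sql
-- Confirm which chunks of the hypertable are compressed.
SELECT chunk_schema, chunk_name, is_compressed
FROM timescaledb_information.chunks
WHERE hypertable_name = 'testdata';

-- List indexes on the internal chunks. With the settings above, the
-- compressed chunks are expected to carry an index on the segmentby
-- columns (idSite, code, instance).
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE schemaname = '_timescaledb_internal'
ORDER BY tablename, indexname;
```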

sb230132 added the bug label Sep 12, 2023
sb230132 self-assigned this Sep 12, 2023
sb230132 added a commit that referenced this issue Sep 12, 2023
An INSERT query with ON CONFLICT on a compressed chunk does a
heap scan to check for a unique constraint violation. This patch
improves performance by doing an index scan on the compressed
chunk to fetch matching records based on the segmentby columns.
These matching records are inserted into the uncompressed chunk,
and the unique constraint violation is then checked.

Since the index on the compressed chunk covers only the segmentby
columns, we cannot do a point lookup that also considers the
orderby columns. This patch therefore decompresses matching
records based only on the segmentby columns.

Fixes #6063
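
Roughly, the fix narrows the compressed-chunk scan to the segmentby values of the incoming row. A hand-written equivalent of that lookup might look like the sketch below; the chunk name `compress_hyper_2_3_chunk` is a made-up placeholder, since compressed chunk names are generated per installation.

```sql
-- Illustrative only: the compressed-chunk name is a placeholder.
-- The patch fetches the compressed batches whose segmentby columns
-- match the incoming row ...
SELECT *
FROM _timescaledb_internal.compress_hyper_2_3_chunk
WHERE idSite = 139400
  AND code = 'V'
  AND instance = 0;
-- ... decompresses those batches into the uncompressed chunk, and only
-- then lets the unique index on (idSite, code, instance, timestamp)
-- detect the conflict. The orderby column (timestamp) cannot narrow the
-- lookup because it is stored inside the compressed batches rather than
-- as a plain indexed column.
```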
@sb230132 (Contributor, Author)

With the patch from #6061, INSERTs are much faster.

db=# begin; EXPLAIN (ANALYZE, BUFFERS, SETTINGS) insert into testdata (timestamp, idSite, code, instance, valuefloat) values (1689336100, 139400, 'V', 0, 3.3) on conflict do nothing;rollback;
BEGIN
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Custom Scan (HypertableModify)  (cost=0.00..0.01 rows=1 width=152) (actual time=1.228..1.231 rows=0 loops=1)
   Buffers: shared hit=56
   ->  Insert on testdata  (cost=0.00..0.01 rows=1 width=152) (actual time=1.228..1.231 rows=0 loops=1)
         Conflict Resolution: NOTHING
         Tuples Inserted: 1
         Conflicting Tuples: 0
         Buffers: shared hit=56
         ->  Custom Scan (ChunkDispatch)  (cost=0.00..0.01 rows=1 width=152) (actual time=1.160..1.163 rows=1 loops=1)
               Buffers: shared hit=51
               ->  Result  (cost=0.00..0.01 rows=1 width=152) (actual time=0.008..0.008 rows=1 loops=1)
 Planning:
   Buffers: shared hit=13
 Planning Time: 0.399 ms
 Execution Time: 1.547 ms
(14 rows)

ROLLBACK

sb230132 added a commit that referenced this issue Sep 12, 2023
An INSERT query with ON CONFLICT on a compressed chunk does a
heap scan to check for a unique constraint violation. This patch
improves performance by doing an index scan on the compressed
chunk to fetch matching records based on the segmentby columns.
These matching records are inserted into the uncompressed chunk,
and the unique constraint violation is then checked.

Since the index on the compressed chunk covers only the segmentby
columns, we cannot do a point lookup that also considers the
orderby columns. This patch therefore decompresses matching
records based only on the segmentby columns.

Fixes #6063
Fixes #5801
@corkscruff

Tagging @wiebeytec for awareness

@antekresic (Contributor)

This was optimized in recent releases.
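
For anyone re-testing, a quick way to check which TimescaleDB version a server is running before comparing timings (a trivial sketch against the standard `pg_extension` catalog):

```sql
-- Show the installed TimescaleDB extension version.
SELECT extversion FROM pg_extension WHERE extname = 'timescaledb';
```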
