Support declarative partitioning method #1154
Hi @bocaph. TimescaleDB does not use trigger-based partitioning. It actually uses a custom, highly optimized insert plan, similar to declarative partitioning. The slight performance advantage you see for PG 11 is more or less expected, because with declarative partitioning you pre-create all subtables (partitions) before you insert data. In contrast, TimescaleDB creates its "chunk" subtables on demand, as they are needed, which adds a small amount of overhead during inserts. The upside is that you need not manually (pre-)partition your table. You'll see a performance advantage with TimescaleDB once you have a lot of subtables; 10 child tables is not enough to show it. Please see this blog post for a fuller explanation of the differences, including benchmarks: https://blog.timescale.com/time-series-data-postgresql-10-vs-timescaledb-816ee808bac5/
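[For illustration, a minimal sketch of the setup difference described above; the table and column names here are assumptions, not taken from any benchmark in this thread:]

-- TimescaleDB: a single call converts the table; no partitions are
-- pre-created. Chunks appear automatically as inserts need them.
CREATE TABLE metrics (
    time   TIMESTAMPTZ      NOT NULL,
    device TEXT             NOT NULL,
    value  DOUBLE PRECISION
);
SELECT create_hypertable('metrics', 'time');

-- This insert transparently creates the chunk covering its time range:
INSERT INTO metrics VALUES (now(), 'dev1', 1.0);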
+1 for this. Currently it's quite a limitation that once your chunk count grows beyond roughly 500, planning time increases to more than 500 ms for even the simplest SELECT queries, even when they use chunk exclusion. In PostgreSQL 12 (current master branch), planning time with 500 partitions is less than 1 ms (for static chunk exclusion). Furthermore, partitioned tables have run-time pruning, which drastically speeds up queries with dynamic values. Neither of these is available for inheritance-based partitioning.
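[A sketch of the two pruning flavors referred to here, reusing the hypothetical metrics table from above: static exclusion happens at plan time for constant predicates, while run-time pruning also covers values known only at execution time, such as prepared-statement parameters:]

-- Plan-time (static) pruning: the constant predicate lets the planner
-- discard non-matching partitions before execution begins.
EXPLAIN SELECT * FROM metrics
WHERE time >= '2019-06-01' AND time < '2019-06-02';

-- Run-time pruning (declarative partitioning, PG 11+): the value is only
-- known at execution time, so partitions are pruned while the plan runs
-- (it shows up as "Subplans Removed" once a generic plan is used).
PREPARE recent(timestamptz) AS SELECT * FROM metrics WHERE time >= $1;
EXPLAIN (ANALYZE) EXECUTE recent(now() - interval '1 day');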
@fvannee Believe me, if it were easy to adopt partitioned tables, we'd do it in a heartbeat. Still, due to technical differences in how we deal with partitioning, it is not going to be easy, if possible at all without major re-engineering. Those would also be breaking changes, so we wouldn't be able to support existing TimescaleDB databases. I can't comment on changes in PG 12 that might have removed some of the big blockers, like opening up all subrelations during INSERTs. Sure, if you insert a lot of data in a single transaction, you effectively amortize the cost of opening 6000 tables even if you don't insert into all of them. But if you often insert batches into the "last" partition of a time-partitioned table, it probably doesn't make sense to also open the 5999 other subtables to do the insert. The only thing I can say in conclusion is that we are constantly on the lookout for how things develop with partitioned tables, and any decision on adopting them depends on whether the pros outweigh the cons.
Please don't get me wrong; I know this would be a change that's definitely not easy. It would require effort both to get it working in the first place and to migrate from previous versions, which is another issue altogether. So for now, "too much effort to implement" is a valid reason. However, I think the reasons laid out in the blog post are, or will soon be, mostly outdated.
Thanks, @erimatnor, @fvannee, for your comments. I had already read the blog post before creating this topic. Just for reference, here are the results of my test on PG 11.2.

TimescaleDB 1.2.2:
EXPLAIN (ANALYZE, BUFFERS) INSERT INTO conditions_timesc SELECT * FROM conditions;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Custom Scan (HypertableInsert) (cost=0.00..8022580.17 rows=414720717 width=46) (actual time=3000785.069..3000785.069 rows=0 loops=1)
Buffers: shared hit=1851493119 read=19929798 dirtied=16015270 written=14424221
-> Insert on conditions_timesc (cost=0.00..8022580.17 rows=414720717 width=46) (actual time=3000785.068..3000785.068 rows=0 loops=1)
Buffers: shared hit=1851493119 read=19929798 dirtied=16015270 written=14424221
-> Custom Scan (ChunkDispatch) (cost=0.00..8022580.17 rows=414720717 width=46) (actual time=15.050..245411.987 rows=414720016 loops=1)
Buffers: shared hit=7092 read=3875550 dirtied=98 written=15
-> Seq Scan on conditions (cost=0.00..8022580.17 rows=414720717 width=46) (actual time=0.019..112648.679 rows=414720016 loops=1)
Buffers: shared read=3875373
Planning Time: 0.575 ms
Execution Time: 3000785.264 ms
(10 rows)
Time: 3000812.120 ms (50:00.812)
Table "public.conditions_timesc"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
-------------+--------------------------+-----------+----------+---------+----------+--------------+-------------
time | timestamp with time zone | | not null | | plain | |
location | text | | not null | | extended | |
temperature | double precision | | | | plain | |
humidity | double precision | | | | plain | |
id | integer | | | | plain | |
Indexes:
"conditions_timesc_time_idx" btree ("time" DESC)
Triggers:
ts_insert_blocker BEFORE INSERT ON conditions_timesc FOR EACH ROW EXECUTE PROCEDURE _timescaledb_internal.insert_blocker()
Child tables: _timescaledb_internal._hyper_1_44_chunk,
_timescaledb_internal._hyper_1_45_chunk,
_timescaledb_internal._hyper_1_46_chunk,
_timescaledb_internal._hyper_1_47_chunk,
_timescaledb_internal._hyper_1_48_chunk,
_timescaledb_internal._hyper_1_49_chunk,
_timescaledb_internal._hyper_1_50_chunk,
_timescaledb_internal._hyper_1_51_chunk,
_timescaledb_internal._hyper_1_52_chunk,
_timescaledb_internal._hyper_1_53_chunk,
              _timescaledb_internal._hyper_1_54_chunk

PostgreSQL 11.2 declarative partitioning:
EXPLAIN (ANALYZE, BUFFERS) INSERT INTO conditions_declare_par SELECT * FROM conditions;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Insert on conditions_declare_par (cost=0.00..8022580.17 rows=414720717 width=46) (actual time=2846752.128..2846752.128 rows=0 loops=1)
Buffers: shared hit=1728205294 read=20406837 dirtied=16529816 written=15015313
-> Seq Scan on conditions (cost=0.00..8022580.17 rows=414720717 width=46) (actual time=0.013..109683.598 rows=414720016 loops=1)
Buffers: shared hit=4 read=3875369
Planning Time: 0.419 ms
Execution Time: 2846752.375 ms
(6 rows)
Time: 2846757.200 ms (47:26.757)
Table "public.conditions_declare_par"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
-------------+--------------------------+-----------+----------+---------+----------+--------------+-------------
time | timestamp with time zone | | not null | | plain | |
location | text | | not null | | extended | |
temperature | double precision | | | | plain | |
humidity | double precision | | | | plain | |
id | integer | | | | plain | |
Partition key: RANGE ("time")
Indexes:
"time_par_idx" btree ("time")
Partitions: conditions_declare_par_child01 FOR VALUES FROM ('2019-04-09 00:00:00+09') TO ('2019-05-09 00:00:00+09'),
conditions_declare_par_child02 FOR VALUES FROM ('2019-05-09 00:00:00+09') TO ('2019-06-08 00:00:00+09'),
conditions_declare_par_child03 FOR VALUES FROM ('2019-06-08 00:00:00+09') TO ('2019-07-08 00:00:00+09'),
conditions_declare_par_child04 FOR VALUES FROM ('2019-07-08 00:00:00+09') TO ('2019-08-07 00:00:00+09'),
conditions_declare_par_child05 FOR VALUES FROM ('2019-08-07 00:00:00+09') TO ('2019-09-06 00:00:00+09'),
conditions_declare_par_child06 FOR VALUES FROM ('2019-09-06 00:00:00+09') TO ('2019-10-06 00:00:00+09'),
conditions_declare_par_child07 FOR VALUES FROM ('2019-10-06 00:00:00+09') TO ('2019-11-05 00:00:00+09'),
conditions_declare_par_child08 FOR VALUES FROM ('2019-11-05 00:00:00+09') TO ('2019-12-05 00:00:00+09'),
conditions_declare_par_child09 FOR VALUES FROM ('2019-12-05 00:00:00+09') TO ('2020-01-04 00:00:00+09'),
conditions_declare_par_child10 FOR VALUES FROM ('2020-01-04 00:00:00+09') TO ('2020-02-04 00:00:00+09')
Hi @erimatnor, @fvannee. Just for reference, here is an update on INSERT performance for PG 11 vs. TimescaleDB in my environment. By the way, I am not seeing any breakage in PG 12 (regarding INSERT performance).

TimescaleDB 1.2.2:
postgres=# explain (analyze, buffers) insert into conditions_timesc select * from conditions;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Custom Scan (HypertableInsert) (cost=0.00..8022580.17 rows=414720717 width=42) (actual time=2787997.225..2787997.225 rows=0 loops=1)
Buffers: shared hit=1276847924 read=20069055 dirtied=16138128 written=14669150
-> Insert on conditions_timesc (cost=0.00..8022580.17 rows=414720717 width=42) (actual time=2787997.224..2787997.224 rows=0 loops=1)
Buffers: shared hit=1276847924 read=20069055 dirtied=16138128 written=14669150
-> Custom Scan (ChunkDispatch) (cost=0.00..8022580.17 rows=414720717 width=42) (actual time=12.731..496809.537 rows=414720016 loops=1)
Buffers: shared hit=6976968 read=3881421 dirtied=3590 written=4054
-> Seq Scan on conditions (cost=0.00..8022580.17 rows=414720717 width=42) (actual time=0.017..183077.727 rows=414720016 loops=1)
Buffers: shared hit=6 read=3875367
Planning Time: 0.529 ms
Execution Time: 2787997.840 ms
(10 rows)
Time: 2788088.903 ms (46:28.089)
postgres=# \d conditions_timesc
Table "public.conditions_timesc"
Column | Type | Collation | Nullable | Default
-------------+--------------------------+-----------+----------+---------
time | timestamp with time zone | | not null |
location | text | | not null |
temperature | double precision | | |
humidity | double precision | | |
Indexes:
"conditions_timesc_time_idx" btree ("time" DESC)
Triggers:
ts_insert_blocker BEFORE INSERT ON conditions_timesc FOR EACH ROW EXECUTE PROCEDURE _timescaledb_internal.insert_blocker()
Number of child tables: 2401 (Use \d+ to list them.)

PostgreSQL 11.2 declarative partitioning:
$ time ./createtbl.sh > /dev/null
real 1m50.429s
user 0m4.925s
sys 0m9.136s
postgres=# explain (analyze, buffers) insert into conditions_declare_par select * from conditions;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Insert on conditions_declare_par (cost=0.00..8022580.17 rows=414720717 width=46) (actual time=2783046.827..2783046.827 rows=0 loops=1)
Buffers: shared hit=1386838237 read=20581596 dirtied=16656432 written=15182389
-> Seq Scan on conditions (cost=0.00..8022580.17 rows=414720717 width=46) (actual time=0.024..183790.473 rows=414720016 loops=1)
Buffers: shared hit=14 read=3875359 written=9
Planning Time: 1.328 ms
Execution Time: 2783061.255 ms
(6 rows)
Time: 2783110.947 ms (46:23.111)
postgres=# \d conditions_declare_par
Table "public.conditions_declare_par"
Column | Type | Collation | Nullable | Default
-------------+--------------------------+-----------+----------+---------
time | timestamp with time zone | | not null |
location | text | | not null |
temperature | double precision | | |
humidity | double precision | | |
Partition key: RANGE ("time")
Indexes:
"time_par_idx" btree ("time")
Number of partitions: 2412 (Use \d+ to list them.)

Thanks,
Thanks for these benchmarks, @bocaph. Maybe I am misreading something in these numbers, but it looks like PG 11.2 and TimescaleDB are on par w.r.t. insert time (46:23 vs. 46:28), which is to be expected (both are fundamentally PostgreSQL underneath and can't go faster than the underlying system allows). However, factor in the table-creation time of 1:50, and TimescaleDB comes out ahead. Furthermore, these are ideal test conditions for declarative partitioning because you are bulk loading all the data in one transaction, allowing PG 11 to amortize the cost of opening all 2412 tables over the entire data set. This is why you are not seeing the performance degradation shown in our benchmarks. If you instead inserted the data in time order in batches of, e.g., 1000 rows, which better represents real-world workloads, you would likely see much worse performance for PG 11.2, since for every batch it would have to open all 2412 subtables (unless something has changed recently). I'd suggest using a benchmark suite like TSBS to test something that more closely resembles production workloads than bulk loading does.

Finally, w.r.t. feature sets, I will note that TimescaleDB already supports fast partition pruning, partition pruning at execution time, and hash partitioning, and it had them long before PostgreSQL did. We also do a number of other performance and planner optimizations that are not in upstream PostgreSQL. By switching to declarative partitioning, we'd risk losing those optimizations, since declarative partitioning is a much more restrictive and opinionated partitioning scheme than regular inheritance. That said, if there were a clear path towards adopting declarative partitioning in TimescaleDB that brought clear benefits, we'd certainly take it.

We very much appreciate this feedback, and in particular the effort put into producing numbers. We welcome further input and will certainly weigh our options based on what the community is telling us.
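[A sketch of the batched, time-ordered load pattern suggested above; the table and columns follow the \d output earlier in the thread, but the batch size, loop bounds, and synthetic values are assumptions. Each INSERT statement is routed separately, so every 1000-row batch pays the partition-setup cost again:]

DO $$
DECLARE
    t timestamptz := '2019-04-09 00:00:00+09';
BEGIN
    WHILE t < '2020-02-04 00:00:00+09' LOOP
        -- One statement per 1000-row batch, in time order; under PG 11
        -- each statement opens and routes across all partitions at
        -- executor startup.
        INSERT INTO conditions_declare_par (time, location, temperature, humidity)
        SELECT t + i * interval '1 second', 'office', 20.0, 50.0
        FROM generate_series(0, 999) AS i;
        t := t + interval '1000 seconds';
    END LOOP;
END $$;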
@fvannee I am not sure that blog post will be entirely outdated as soon as you think. I might be missing something, but of the issues raised in the blog post, the only changes in PostgreSQL's favor so far are the addition of hash partitioning and unique-index support (both long-standing TimescaleDB features). Also note that the blog post does not make the case against using declarative partitioning in TimescaleDB; it simply argues for the UX and performance advantages of TimescaleDB over plain declarative partitioning for typical use cases. Thus, I'd say the blog post would remain valid even if we adopted declarative partitioning under the hood, since one of the main advantages of TimescaleDB is automation and various performance optimizations around time series. Still, those points aside, there remain major blockers raised in that blog post:
To round off: we are looking at ways to leverage some of the work and performance optimizations that went into declarative partitioning without adopting the full approach outright. See for instance this PR: #1134. More work is coming.
@erimatnor Exactly, TimescaleDB has a lot of advantages over declarative partitioning in automation and flexibility. I'm afraid I won't be able to give any useful comments on multi-dimensional partitioning, as I haven't used it myself. I do recognize that it could be an important feature for other people, though.
Thanks for these benchmarks @fvannee, some pretty awesome stuff there. We will definitely be on the lookout for improvements in PG12, and hopefully we'll be able to benefit from those as well.
Thanks, @fvannee, for the benchmarks. Thanks, @erimatnor, I am very glad to hear it.
@erimatnor Thanks for the great work on TimescaleDB. Would you be able to elaborate on case 4)? It seems strange to me that a regular query is fast while the exact same query, inlined from a function, is slow. Since regular PG 11 is slow in both cases, I assume TimescaleDB optimizes this somewhere but doesn't handle the inlined-function case well?
Nice work testing this and thanks for sharing the summary!
Still no plan to support it?
A simple POC revealed the following issues with using table partitioning.
Now, there might be ways to get around these issues, so the investigation continues. POC code: https://github.com/erimatnor/timescaledb/tree/hypertables-with-table-partitioning-poc
Here's the patch for the issue mentioned above that prevents creating new partitions during inserts: postgres/postgres@dc3e436b191. This code in PostgreSQL is currently a major blocker for adopting declarative partitioning in TimescaleDB, but there are other reasons, too, why it might not be a good idea.
Hi,
Thanks for working on the amazing TimescaleDB project.
I am new to TimescaleDB, and this is just a question for confirmation, not an issue report.
Do you have a roadmap for switching from trigger-based partitioning to the declarative partitioning method?
We ran some performance tests of TimescaleDB vs. PostgreSQL 11 declarative partitioning and saw a small performance advantage for PostgreSQL 11 declarative partitioning (7% on a bulk INSERT of 30 GB of data into 10 child tables, using Dell storage).
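[For reference, the declarative-partitioning side of such a test looks roughly like this; the names, columns, and ranges follow the \d output earlier in the thread, but this is a sketch, not the exact test script:]

CREATE TABLE conditions_declare_par (
    time        TIMESTAMPTZ NOT NULL,
    location    TEXT        NOT NULL,
    temperature DOUBLE PRECISION,
    humidity    DOUBLE PRECISION
) PARTITION BY RANGE ("time");

-- Unlike a hypertable, every partition must be created up front:
CREATE TABLE conditions_declare_par_child01 PARTITION OF conditions_declare_par
    FOR VALUES FROM ('2019-04-09 00:00:00+09') TO ('2019-05-09 00:00:00+09');
CREATE TABLE conditions_declare_par_child02 PARTITION OF conditions_declare_par
    FOR VALUES FROM ('2019-05-09 00:00:00+09') TO ('2019-06-08 00:00:00+09');
-- ... and so on, through conditions_declare_par_child10.

CREATE INDEX time_par_idx ON conditions_declare_par ("time");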