[Bug]: Slow copy on instance containing many tables #7780

Open
anayrat opened this issue Feb 27, 2025 · 1 comment

anayrat commented Feb 27, 2025

What type of bug is this?

Performance issue

What subsystems and features are affected?

Data ingestion

What happened?

Hello,
I am migrating a large TimescaleDB instance (200,000 tables, 5 million chunks) from PostgreSQL 14 to PostgreSQL 16.

The tables are relatively simple, and my migration is also quite simple. I wrote a tool that, for each table (a sketch of this loop follows the list):

  • COPY the table to a compressed file
  • move the compressed file to the destination server
  • create the hypertable
  • COPY from the compressed file
  • compress the hypertable
  • repeat; after 100 tables, close the backend and open a new one to clear the syscache
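
For illustration, a minimal sketch of this loop, assuming psycopg2 and gzip-compressed dumps (DSNs, paths, and helper names are made up for the example, not the actual tool):

import gzip
import psycopg2

def dump_table(src_dsn, table, path):
    # COPY the table out of the source instance into a compressed file.
    with psycopg2.connect(src_dsn) as conn, gzip.open(path, "wt") as f:
        with conn.cursor() as cur:
            cur.copy_expert(f"COPY {table} TO STDOUT", f)
    # (The file is then shipped to the destination server.)

def restore_table(cur, table, path):
    # Assumes the empty table already exists on the destination.
    cur.execute(f"SELECT create_hypertable('{table}', 'ts')")
    with gzip.open(path, "rt") as f:
        cur.copy_expert(f"COPY {table} FROM STDIN", f)
    cur.execute(f"SELECT compress_chunk(c) FROM show_chunks('{table}') c")

def migrate(src_dsn, dst_dsn, tables):
    # Reconnect every 100 tables so the destination backend (and its
    # syscache) is dropped and rebuilt.
    for i in range(0, len(tables), 100):
        with psycopg2.connect(dst_dsn) as dst, dst.cursor() as cur:
            for table in tables[i:i + 100]:
                path = f"/tmp/{table}.csv.gz"
                dump_table(src_dsn, table, path)
                restore_table(cur, table, path)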

The time_interval is sometimes 30 days, sometimes 1 year.

The server is well tuned (huge pages, shared_buffers at 64 GB).

Everything went well: with parallelization, I was able to saturate the disks at around 2 GB/s on the destination server.

Unfortunately, after around 110,000 tables, throughput dropped from 50 tables/s to one table every 5 seconds.

Even a single COPY became very slow, around one tuple/s. There is no wait_event on the backend; I only see a COPY using one CPU at 100%.
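
For reference, a rough sketch of one way to watch the backend's wait events via pg_stat_activity (not the exact check I used; psycopg2, the DSN, and the helper name are assumptions):

import time
import psycopg2

def watch_copy_backends(dsn, interval=1.0):
    # Poll pg_stat_activity for COPY backends; wait_event stays NULL while
    # the backend is busy on-CPU.
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        while True:
            cur.execute(
                "SELECT pid, state, wait_event_type, wait_event "
                "FROM pg_stat_activity WHERE query ILIKE 'copy %'"
            )
            for row in cur.fetchall():
                print(row)
            time.sleep(interval)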

I ran perf top on the process:

  Children      Self  Shared Object                                                   Symbol                                    
+   27.26%    27.17%  postgres                                                        [.] hash_search_with_hash_value           
+   26.06%    25.92%  postgres                                                        [.] nocachegetattr                        
+   13.19%    13.19%  postgres                                                        [.] 0x000000000016363d                    
+   10.84%    10.76%  postgres                                                                                                  

It seems the backend is doing an attribute lookup for every tuple.
I captured several backtraces with gdb:

#0  0x000000000052a520 in nocachegetattr ()
#1  0x0000000000564ab4 in heapgettup_pagemode ()
#2  0x0000000000565104 in heap_getnextslot ()
#3  0x0000000000583a0e in systable_getnext ()
#4  0x00007fd7bbf93b7b in ts_chunk_copy_referencing_fk () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#5  0x00007fd7bbf8f8a9 in chunk_create_from_hypercube_after_lock.lto_priv () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#6  0x00007fd7bbf9ad6f in ts_hypertable_create_chunk_for_point () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#7  0x00007fd7bbfc62b0 in ts_chunk_dispatch_get_chunk_insert_state () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#8  0x00007fd7bbfd1be6 in copyfrom.constprop () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#9  0x00007fd7bbf9fcf7 in process_copy () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#10 0x00007fd7bbfa57a9 in timescaledb_ddl_command_start () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#0  0x0000000000a422d0 in hash_search_with_hash_value ()
#1  0x00000000008930dc in ReadBuffer_common.lto_priv.0 ()
#2  0x0000000000893528 in ReadBufferExtended ()
#3  0x0000000000563ea5 in heapgetpage ()
#4  0x0000000000564c24 in heapgettup_pagemode ()
#5  0x0000000000565104 in heap_getnextslot ()
#6  0x0000000000583a0e in systable_getnext ()
#7  0x00007fd7bbf93b7b in ts_chunk_copy_referencing_fk () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#8  0x00007fd7bbf8f8a9 in chunk_create_from_hypercube_after_lock.lto_priv () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#9  0x00007fd7bbf9ad6f in ts_hypertable_create_chunk_for_point () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#10 0x00007fd7bbfc62b0 in ts_chunk_dispatch_get_chunk_insert_state () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#11 0x00007fd7bbfd1be6 in copyfrom.constprop () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#12 0x00007fd7bbf9fcf7 in process_copy () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#13 0x00007fd7bbfa57a9 in timescaledb_ddl_command_start () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#14 0x00000000008d047a in PortalRunUtility ()

TimescaleDB version affected

2.17.1

PostgreSQL version used

16.7

What operating system did you use?

redhat 9.5

What installation method did you use?

Source

What platform did you run on?

Amazon Web Services (AWS)

Relevant log output and stack trace

#0  0x000000000052a520 in nocachegetattr ()
#1  0x0000000000564ab4 in heapgettup_pagemode ()
#2  0x0000000000565104 in heap_getnextslot ()
#3  0x0000000000583a0e in systable_getnext ()
#4  0x00007fd7bbf93b7b in ts_chunk_copy_referencing_fk () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#5  0x00007fd7bbf8f8a9 in chunk_create_from_hypercube_after_lock.lto_priv () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#6  0x00007fd7bbf9ad6f in ts_hypertable_create_chunk_for_point () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#7  0x00007fd7bbfc62b0 in ts_chunk_dispatch_get_chunk_insert_state () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#8  0x00007fd7bbfd1be6 in copyfrom.constprop () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#9  0x00007fd7bbf9fcf7 in process_copy () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#10 0x00007fd7bbfa57a9 in timescaledb_ddl_command_start () from /usr/pgsql-16/lib/timescaledb-2.17.1.so

How can we reproduce the bug?

Create a cluster with more than 150,000 tables and around 20 chunks per hypertable.
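
A rough scripted version of this setup, assuming psycopg2 (the table layout matches the comment below; names, DSN, and counts are illustrative):

import psycopg2

def create_many_hypertables(dsn, n_tables=150_000):
    # Create many small hypertables, each with ~20 chunks (one row per
    # 30-day chunk interval over 20 intervals).
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        for i in range(n_tables):
            table = f"metrics_{i}"
            cur.execute(
                f"CREATE TABLE {table} (col1 int, ts timestamptz NOT NULL, value float8)"
            )
            cur.execute(
                f"SELECT create_hypertable('{table}', 'ts', "
                f"chunk_time_interval => interval '30 days')"
            )
            cur.execute(
                f"INSERT INTO {table} "
                f"SELECT 1, now() - (g * interval '30 days'), random() "
                f"FROM generate_series(0, 19) g"
            )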
anayrat added the bug label Feb 27, 2025

anayrat commented Feb 27, 2025

To be more precise, the tables are very simple: 3 columns (col1, ts, value) and 2 indexes.
There is no FK.
