[Bug]: Slow copy on instance containing many tables #7780

Open
anayrat opened this issue Feb 27, 2025 · 1 comment

anayrat commented Feb 27, 2025

What type of bug is this?

Performance issue

What subsystems and features are affected?

Data ingestion

What happened?

Hello,
I am migrating a large TimescaleDB instance (200,000 tables, 5 million chunks) from PostgreSQL 14 to PostgreSQL 16.

The tables are relatively simple, and my migration is also quite simple. I wrote a tool that, for each table (a sketch of this loop follows the list):

  • COPY the table to a compressed file
  • move the compressed file to the destination server
  • create the hypertable
  • COPY from the compressed file
  • compress the hypertable
  • repeat; after 100 tables, close the backend and open a new one to clear the syscache
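
For illustration, a minimal sketch of this loop, assuming psycopg2 and gzip-compressed dumps (DSNs, paths, and helper names are made up for the example, not the actual tool):

import gzip
import psycopg2

def dump_table(src_dsn, table, path):
    # COPY the table out of the source instance into a compressed file.
    with psycopg2.connect(src_dsn) as conn, gzip.open(path, "wt") as f:
        with conn.cursor() as cur:
            cur.copy_expert(f"COPY {table} TO STDOUT", f)
    # (The file is then shipped to the destination server.)

def restore_table(cur, table, path):
    # Assumes the empty table already exists on the destination.
    cur.execute(f"SELECT create_hypertable('{table}', 'ts')")
    with gzip.open(path, "rt") as f:
        cur.copy_expert(f"COPY {table} FROM STDIN", f)
    cur.execute(f"SELECT compress_chunk(c) FROM show_chunks('{table}') c")

def migrate(src_dsn, dst_dsn, tables):
    # Reconnect every 100 tables so the destination backend (and its
    # syscache) is dropped and rebuilt.
    for i in range(0, len(tables), 100):
        with psycopg2.connect(dst_dsn) as dst, dst.cursor() as cur:
            for table in tables[i:i + 100]:
                path = f"/tmp/{table}.csv.gz"
                dump_table(src_dsn, table, path)
                restore_table(cur, table, path)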

The time_interval is sometimes 30 days, sometimes 1 year.

The server is well tuned (huge pages, shared_buffers at 64 GB).

Everything went well: with parallelization, I was able to saturate the disks at around 2 GB/s on the destination server.

Unfortunately, after around 110,000 tables, throughput dropped from 50 tables/s to one table every 5 seconds.

Even a single COPY became very slow, around one tuple/s. There is no wait_event on the backend; I only see a COPY using one CPU at 100%.
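
For reference, a rough sketch of one way to watch the backend's wait events via pg_stat_activity (not the exact check I used; psycopg2, the DSN, and the helper name are assumptions):

import time
import psycopg2

def watch_copy_backends(dsn, interval=1.0):
    # Poll pg_stat_activity for COPY backends; wait_event stays NULL while
    # the backend is busy on-CPU.
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        while True:
            cur.execute(
                "SELECT pid, state, wait_event_type, wait_event "
                "FROM pg_stat_activity WHERE query ILIKE 'copy %'"
            )
            for row in cur.fetchall():
                print(row)
            time.sleep(interval)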

I ran perf top on the process:

  Children      Self  Shared Object                                                   Symbol                                    
+   27.26%    27.17%  postgres                                                        [.] hash_search_with_hash_value           
+   26.06%    25.92%  postgres                                                        [.] nocachegetattr                        
+   13.19%    13.19%  postgres                                                        [.] 0x000000000016363d                    
+   10.84%    10.76%  postgres                                                                                                  

It seems the backend is doing an attribute lookup for every tuple.
I captured several backtraces with gdb:

#0  0x000000000052a520 in nocachegetattr ()
#1  0x0000000000564ab4 in heapgettup_pagemode ()
#2  0x0000000000565104 in heap_getnextslot ()
#3  0x0000000000583a0e in systable_getnext ()
#4  0x00007fd7bbf93b7b in ts_chunk_copy_referencing_fk () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#5  0x00007fd7bbf8f8a9 in chunk_create_from_hypercube_after_lock.lto_priv () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#6  0x00007fd7bbf9ad6f in ts_hypertable_create_chunk_for_point () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#7  0x00007fd7bbfc62b0 in ts_chunk_dispatch_get_chunk_insert_state () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#8  0x00007fd7bbfd1be6 in copyfrom.constprop () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#9  0x00007fd7bbf9fcf7 in process_copy () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#10 0x00007fd7bbfa57a9 in timescaledb_ddl_command_start () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#0  0x0000000000a422d0 in hash_search_with_hash_value ()
#1  0x00000000008930dc in ReadBuffer_common.lto_priv.0 ()
#2  0x0000000000893528 in ReadBufferExtended ()
#3  0x0000000000563ea5 in heapgetpage ()
#4  0x0000000000564c24 in heapgettup_pagemode ()
#5  0x0000000000565104 in heap_getnextslot ()
#6  0x0000000000583a0e in systable_getnext ()
#7  0x00007fd7bbf93b7b in ts_chunk_copy_referencing_fk () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#8  0x00007fd7bbf8f8a9 in chunk_create_from_hypercube_after_lock.lto_priv () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#9  0x00007fd7bbf9ad6f in ts_hypertable_create_chunk_for_point () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#10 0x00007fd7bbfc62b0 in ts_chunk_dispatch_get_chunk_insert_state () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#11 0x00007fd7bbfd1be6 in copyfrom.constprop () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#12 0x00007fd7bbf9fcf7 in process_copy () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#13 0x00007fd7bbfa57a9 in timescaledb_ddl_command_start () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#14 0x00000000008d047a in PortalRunUtility ()

TimescaleDB version affected

2.17.1

PostgreSQL version used

16.7

What operating system did you use?

redhat 9.5

What installation method did you use?

Source

What platform did you run on?

Amazon Web Services (AWS)

Relevant log output and stack trace

#0  0x000000000052a520 in nocachegetattr ()
#1  0x0000000000564ab4 in heapgettup_pagemode ()
#2  0x0000000000565104 in heap_getnextslot ()
#3  0x0000000000583a0e in systable_getnext ()
#4  0x00007fd7bbf93b7b in ts_chunk_copy_referencing_fk () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#5  0x00007fd7bbf8f8a9 in chunk_create_from_hypercube_after_lock.lto_priv () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#6  0x00007fd7bbf9ad6f in ts_hypertable_create_chunk_for_point () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#7  0x00007fd7bbfc62b0 in ts_chunk_dispatch_get_chunk_insert_state () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#8  0x00007fd7bbfd1be6 in copyfrom.constprop () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#9  0x00007fd7bbf9fcf7 in process_copy () from /usr/pgsql-16/lib/timescaledb-2.17.1.so
#10 0x00007fd7bbfa57a9 in timescaledb_ddl_command_start () from /usr/pgsql-16/lib/timescaledb-2.17.1.so

How can we reproduce the bug?

Create a cluster with more than 150,000 tables and around 20 chunks per hypertable.
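
A rough scripted version of this setup, assuming psycopg2 (the table layout matches the comment below; names, DSN, and counts are illustrative):

import psycopg2

def create_many_hypertables(dsn, n_tables=150_000):
    # Create many small hypertables, each with ~20 chunks (one row per
    # 30-day chunk interval over 20 intervals).
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        for i in range(n_tables):
            table = f"metrics_{i}"
            cur.execute(
                f"CREATE TABLE {table} (col1 int, ts timestamptz NOT NULL, value float8)"
            )
            cur.execute(
                f"SELECT create_hypertable('{table}', 'ts', "
                f"chunk_time_interval => interval '30 days')"
            )
            cur.execute(
                f"INSERT INTO {table} "
                f"SELECT 1, now() - (g * interval '30 days'), random() "
                f"FROM generate_series(0, 19) g"
            )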
anayrat added the bug label Feb 27, 2025

anayrat commented Feb 27, 2025

To be more precise, the tables are very simple: 3 columns (col1, ts, value) and 2 indexes.
There is no FK.
