
[Enhancement]: compress chunks in the same hypertable in parallel #6239

Closed

Tindarid opened this issue Oct 27, 2023 · 2 comments
Labels
compression, enhancement

Comments

Tindarid commented Oct 27, 2023

What type of enhancement is this?

Performance

What subsystems and features will be improved?

Compression

What does the enhancement do?

Hi!

I am trying to compress several chunks in parallel within the same hypertable (across different hypertables it works fine).

From my understanding, during compression we should take an exclusive lock on the new compressed chunk and read from the raw chunk (truncating the raw chunk at the end). Updating metadata on the main table should also require locks on the whole table, but only for a short period.

When I started compressing 2 chunks in parallel, I noticed that the first chunk's compression blocks the second one on a
ShareUpdateExclusiveLock on _timescaledb_internal.compressed_hypertable*
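
Roughly what I am running (chunk names below are placeholders; in practice they come from show_chunks()), plus a pg_locks query I used to see who is waiting:

```sql
-- Session 1: compress the first chunk (placeholder chunk name).
SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk');

-- Session 2, started while session 1 is still running:
-- blocks on the ShareUpdateExclusiveLock described above.
SELECT compress_chunk('_timescaledb_internal._hyper_1_2_chunk');

-- Session 3: show which ShareUpdateExclusiveLocks are held or awaited.
SELECT pid, relation::regclass AS relation, mode, granted
FROM pg_locks
WHERE mode = 'ShareUpdateExclusiveLock';
```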

While digging into the code, I think this piece of logic takes that lock:
https://github.com/timescale/timescaledb/blob/main/src/chunk.c#L1145
The comment there reads: "Serialize chunk creation around a lock on the 'main table' to avoid multiple processes trying to create the same chunk"

I think this situation could be handled differently and should not require a lock on the main compressed hypertable (which blocks the other chunk's compression, since it takes the same ShareUpdateExclusiveLock).

What I mean here: suppose that when we create a chunk, we also preallocate the compressed chunk name for it. During compression we would then lock only that compressed chunk and not interfere with operations on the main hypertable.
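
For illustration: as far as I can tell, the catalog already records the raw-chunk to compressed-chunk mapping after the fact (column names taken from my local install, so they may differ between versions); preallocation would mean filling this in at chunk-creation time instead:

```sql
-- compressed_chunk_id is NULL until a chunk gets compressed;
-- with preallocation it could be assigned when the chunk is created.
SELECT id, schema_name, table_name, compressed_chunk_id
FROM _timescaledb_catalog.chunk
ORDER BY id;
```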

P.S.

  1. I noticed that decompression in parallel works fine
  2. I think this enhancement could serve as a workaround for Parallel Compression threads for high throughput data #3077
  3. compressed_hypertable_*: what is this used for? I cannot find any documentation

Implementation challenges

No response

Tindarid added the enhancement label Oct 27, 2023
Tindarid (Author) commented:

Also: recompression works in parallel quite well, because it doesn't create a new chunk.

Recompression/decompression can still interfere with each other by consuming system resources (disk can be a bottleneck).

But if there are enough resources, the non-blocking nature gives a significant speedup for bulk operations that involve several chunks.
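
As a sketch of the bulk recompression I mean (chunk names are placeholders, and I am using the recompress_chunk procedure available on my version):

```sql
-- Each statement runs in its own session/connection, concurrently,
-- against a different chunk of the same hypertable.
CALL recompress_chunk('_timescaledb_internal._hyper_1_1_chunk');
CALL recompress_chunk('_timescaledb_internal._hyper_1_2_chunk');
```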

antekresic (Contributor) commented:

Fixed by #6450
