Fix a deadlock when decompressing chunks and performing SELECTs #4676
Conversation
Force-pushed from a7e6451 to e0db89d
Codecov Report

```
@@            Coverage Diff             @@
##             main    #4676      +/-   ##
==========================================
- Coverage   90.92%   90.89%   -0.03%
==========================================
  Files         224      224
  Lines       42406    42407       +1
==========================================
- Hits        38556    38545      -11
- Misses       3850     3862      +12
```
```c
/* Prevent readers from using the compressed chunk that is going to be deleted */
LockRelationOid(uncompressed_chunk->table_id, AccessExclusiveLock);
```
Huh, so it was actually a typo -- I never noticed the discrepancy with the comment.
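For reference, the corrected call (visible in a later revision of this PR, quoted further down) passes the compressed chunk's OID, so the code matches the comment:

```c
/* Prevent readers from using the compressed chunk that is going to be deleted */
LockRelationOid(compressed_chunk->table_id, AccessExclusiveLock);
```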
tsl/src/compression/api.c (outdated)
```c
 * Prevents readers from using the compressed chunk that is going to be
 * deleted. Calling performMultipleDeletions in chunk_index_tuple_delete
 * also requests an AccessExclusiveLock. However, this call makes the
 * lock on the chunk explicit.
```
This change not only makes the lock explicit, but also moves it earlier, before ts_compression_chunk_size_delete. Do we have a test case that breaks if we remove the explicit lock altogether?
At first glance, nothing should break, because without the lock the concurrent SELECTs just stop seeing the compressed chunk during planning, earlier than it is actually dropped.
I am not able to create a failing test for this case. However, I am in favor of making the lock explicit, because PostgreSQL also requests an AccessExclusiveLock before performMultipleDeletions is invoked. So, I want to be compliant with PostgreSQL's logic.
But this is not only about explicitness; this is changing the relative order of events -- e.g. whether we modify the chunks table holding the exclusive lock on the compressed chunk or not. If you want just to make it analogous to Postgres, the place to take the deletion lock would be just before the ts_chunk_drop call, or maybe inside the ts_chunk_drop_internal call, or just relying on performDelete doing the same thing.
If we explicitly want to lock it before modifying the catalog, I think we need some justification for this. I mean, if we have another problem in this place, someone will look at this lock again and think: why is it here and not there?
Given that the catalog modification and dropping the chunk happen inside a transaction, they are both going to become visible atomically anyway, no matter where we put this lock.
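To make the candidate placements concrete, here is a hedged sketch of the sequence under discussion. The ordering is reconstructed from this thread and the stack trace below; the helpers ending in _sketch are hypothetical stand-ins for the real calls (ts_compression_chunk_size_delete, the chunk catalog cleanup, and ts_chunk_drop), not the actual tsl/src/compression/api.c code.

```c
#include "postgres.h"
#include "storage/lmgr.h"     /* LockRelationOid */
#include "storage/lockdefs.h" /* AccessExclusiveLock */

/* Hypothetical stand-ins for the calls named in this thread */
static void delete_chunk_size_record_sketch(void) {}  /* ts_compression_chunk_size_delete */
static void delete_chunk_catalog_record_sketch(void) {}
static void drop_compressed_relation_sketch(void) {}  /* ts_chunk_drop -> performMultipleDeletions */

static void
decompress_chunk_ordering_sketch(Oid compressed_relid)
{
	/* (A) PR as written: explicit lock before any catalog modification */
	LockRelationOid(compressed_relid, AccessExclusiveLock);

	delete_chunk_size_record_sketch();
	delete_chunk_catalog_record_sketch();

	/* (B) Alternative raised above: take the lock only here, just before the
	 * drop, which keeps the relative order of events unchanged; or (C) take
	 * no explicit lock and rely on performMultipleDeletions acquiring
	 * AccessExclusiveLock itself. Since everything runs in one transaction,
	 * other sessions see the catalog changes and the drop atomically at
	 * commit, regardless of the placement. */
	drop_compressed_relation_sketch();
}
```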
@AKZUM Moving the lock just before ts_chunk_drop is a good point to ensure the order of events is not changed.
Are you generally in favor of performing this lock, or do you argue that it is not needed because we can rely on performDelete?
> the order of events is not changed

Of which events exactly?
I think the locking performed by performDelete is enough for correctness. The reason I'm asking is that I don't understand the rationale for 1) taking this lock explicitly (well, maybe documentation and following the general Postgres line of thought) and 2) most importantly, the rationale for doing it at this particular line and not in ts_chunk_drop_internal like point (1) would suggest. If we are trying to prevent some unwanted sequence of events, what is this sequence exactly? I think it's important to have such a description in the future when we are going to modify this code, so that we are able to do it correctly.
> Of which events exactly?

I mean, we probably should have a comment describing what the order is, and why it is important. At first glance, there's no meaningful ordering of cleaning up the chunk record, the chunk size record, and dropping the compressed relation -- to other transactions they happen atomically at the end of the decompressing transaction.
Force-pushed from 41e38ed to e99854c
```c
 * (as done in PostgreSQL when tables are dropped,
 * see RemoveRelations).
 */
LockRelationOid(compressed_chunk->table_id, AccessExclusiveLock);
```
Could you elaborate why this is needed? Doesn't ts_chunk_drop (or a function called by it) acquire the AccessExclusiveLock before dropping the table?
My understanding is that no AccessExclusiveLock on the compressed chunk is requested in our existing code before it is deleted.
I want to be compliant with the way PostgreSQL performs locks before tables are deleted and performMultipleDeletions is called. Maybe I am overcautious at this point and we can trust the internals of performMultipleDeletions; then the lock would be superfluous.
Attached are the locks that are active before performMultipleDeletions is called in ts_chunk_index_delete_by_chunk_id and the AccessExclusiveLock is implicitly taken on the chunk (_timescaledb_internal.compress_hyper_2_7296_chunk is uncompressed in this example).

```sql
SELECT relation::regclass AS name, locktype, database, relation, pid, mode, granted, fastpath, waitstart
FROM pg_locks
WHERE relation::regclass::text LIKE '%chunk'
ORDER BY relation, locktype, mode, granted;
```
```
                       name                        | locktype | database | relation |   pid   |       mode       | granted | fastpath | waitstart
---------------------------------------------------+----------+----------+----------+---------+------------------+---------+----------+-----------
 _timescaledb_catalog.chunk                        | relation |    42698 |    42810 | 1113329 | RowExclusiveLock | t       | t        |
 _timescaledb_internal._hyper_1_1_chunk            | relation |    42698 |    43372 | 1113329 | ExclusiveLock    | t       | f        |
 _timescaledb_internal._hyper_1_1_chunk            | relation |    42698 |    43372 | 1113329 | ShareLock        | t       | f        |
 _timescaledb_internal.compress_hyper_2_7296_chunk | relation |    42698 |    83658 | 1113329 | ExclusiveLock    | t       | f        |
(4 rows)
```
```
timescaledb-2.9.0-dev.so!chunk_index_tuple_delete(TupleInfo * ti, void * data) (/home/jan/timescaledb/src/chunk_index.c:669)
timescaledb-2.9.0-dev.so!ts_scanner_scan(ScannerCtx * ctx) (/home/jan/timescaledb/src/scanner.c:451)
timescaledb-2.9.0-dev.so!chunk_index_scan(int indexid, ScanKeyData * scankey, int nkeys, tuple_found_func tuple_found, tuple_filter_func tuple_filter, void * data, LOCKMODE lockmode) (/home/jan/timescaledb/src/chunk_index.c:502)
timescaledb-2.9.0-dev.so!ts_chunk_index_delete_by_chunk_id(int32 chunk_id, _Bool drop_index) (/home/jan/timescaledb/src/chunk_index.c:779)
timescaledb-2.9.0-dev.so!chunk_tuple_delete(TupleInfo * ti, DropBehavior behavior, _Bool preserve_chunk_catalog_row) (/home/jan/timescaledb/src/chunk.c:2879)
timescaledb-2.9.0-dev.so!chunk_delete(ScanIterator * iterator, DropBehavior behavior, _Bool preserve_chunk_catalog_row) (/home/jan/timescaledb/src/chunk.c:2954)
timescaledb-2.9.0-dev.so!ts_chunk_delete_by_name_internal(const char * schema, const char * table, DropBehavior behavior, _Bool preserve_chunk_catalog_row) (/home/jan/timescaledb/src/chunk.c:2981)
timescaledb-2.9.0-dev.so!ts_chunk_delete_by_relid(Oid relid, DropBehavior behavior, _Bool preserve_chunk_catalog_row) (/home/jan/timescaledb/src/chunk.c:3002)
timescaledb-2.9.0-dev.so!ts_chunk_drop_internal(const Chunk * chunk, DropBehavior behavior, int32 log_level, _Bool preserve_catalog_row) (/home/jan/timescaledb/src/chunk.c:3669)
timescaledb-2.9.0-dev.so!ts_chunk_drop(const Chunk * chunk, DropBehavior behavior, int32 log_level) (/home/jan/timescaledb/src/chunk.c:3678)
timescaledb-tsl-2.9.0-dev.so!decompress_chunk_impl(Oid uncompressed_hypertable_relid, Oid uncompressed_chunk_relid, _Bool if_compressed) (/home/jan/timescaledb/tsl/src/compression/api.c:413)
timescaledb-tsl-2.9.0-dev.so!tsl_decompress_chunk(FunctionCallInfo fcinfo) (/home/jan/timescaledb/tsl/src/compression/api.c:645)
timescaledb-2.9.0-dev.so!ts_decompress_chunk(FunctionCallInfo fcinfo) (/home/jan/timescaledb/src/cross_module_fn.c:86)
ExecInterpExpr(ExprState * state, ExprContext * econtext, _Bool * isnull) (/home/jan/postgresql-sandbox/src/REL_14_2/src/backend/executor/execExprInterp.c:749)
ExecInterpExprStillValid(ExprState * state, ExprContext * econtext, _Bool * isNull) (/home/jan/postgresql-sandbox/src/REL_14_2/src/backend/executor/execExprInterp.c:1824)
ExecEvalExprSwitchContext(ExprState * state, ExprContext * econtext, _Bool * isNull) (/home/jan/postgresql-sandbox/src/REL_14_2/src/include/executor/executor.h:339)
ExecProject(ProjectionInfo * projInfo) (/home/jan/postgresql-sandbox/src/REL_14_2/src/include/executor/executor.h:373)
ExecScan(ScanState * node, ExecScanAccessMtd accessMtd, ExecScanRecheckMtd recheckMtd) (/home/jan/postgresql-sandbox/src/REL_14_2/src/backend/executor/execScan.c:238)
ExecFunctionScan(PlanState * pstate) (/home/jan/postgresql-sandbox/src/REL_14_2/src/backend/executor/nodeFunctionscan.c:270)
ExecProcNodeFirst(PlanState * node) (/home/jan/postgresql-sandbox/src/REL_14_2/src/backend/executor/execProcnode.c:463)
```
[mkindahl posted a drawing illustrating the locking situation; the image did not survive extraction.]
@mkindahl Thank you for creating the drawing. It illustrates the problem well. Before I write a longer reply, I would like to make sure that I understand your concern correctly.
- Are you concerned that the AccessExclusiveLock should be taken later in the code so that readers can access the chunk as long as possible, or
- are you concerned about this situation in general (i.e., a reader that waits on an AccessShareLock for a chunk and the chunk no longer exists after the lock is granted)?
I'm concerned about the latter. In this situation, the reader should not even reach this point in the code.
If a decompression is ongoing, the compressed chunk will be decompressed and removed, and the reader should be re-routed to the uncompressed chunk where the data will reside once the decompression is done. Then it can start reading from the chunk once the lock is granted. With this locking pattern, readers will experience errors if they are unlucky and race with the decompression job.
@mkindahl In the most common code path (when using a hypertable with at least one index), the readers are already blocked by the AccessExclusiveLock on the index caused by the reindex_relation call after decompressing the chunk. However, it's a good point to also consider hypertables without any index and grab the lock after the compressed chunk is removed from the catalog, to route such reads properly. I changed the PR accordingly.
Are you generally in favor of requesting an explicit AccessExclusiveLock on the chunk before we delete it? I am not sure if this is really necessary. I introduced it because PostgreSQL explicitly requests an AccessExclusiveLock before a table is dropped and performMultipleDeletions is invoked, and I want to be consistent with the way PostgreSQL implements similar functionality. However, the preconditions for calling the performMultipleDeletions function don't seem to be explicitly defined. I am not sure what the correct/best solution is here. Maybe you have some advice for me.
> Are you generally in favor of requesting an explicit AccessExclusiveLock on the chunk before we delete it? I am not sure if this is really necessary. I introduced it because PostgreSQL explicitly requests an AccessExclusiveLock before a table is dropped and performMultipleDeletions is invoked, and I want to be consistent with the way PostgreSQL implements similar functionality. However, the preconditions for calling the performMultipleDeletions function don't seem to be explicitly defined. I am not sure what the correct/best solution is here. Maybe you have some advice for me.

If that is a precondition for calling performMultipleDeletions, then we should do it, and in general you need to lock the table that you're deleting, but this is not the same as the problem described above.
My comment above was more a result of the fact that you are trying to solve a deadlock and in the comment say that you're locking here to prevent readers from reading the chunk. If you change this and readers do not normally reach this path, you need to update the comment so that it is accurate.
@mkindahl Indeed, the comment was outdated and misleading. I have updated it in the current version of the PR.
The deadlock was introduced in a608d7d by requesting a lock for the uncompressed chunk instead of the compressed one (uncompressed_chunk->table_id instead of compressed_chunk->table_id). This is changed in this PR and solves the deadlock.
The PostgreSQL documentation does not mention whether such an AccessExclusiveLock is a prerequisite for calling performMultipleDeletions. But PostgreSQL explicitly requests such a lock before calling the function. I am therefore also in favor of explicitly requesting this lock.
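Since the thread leans on this precedent, here is a condensed, hedged paraphrase of the PostgreSQL pattern being cited (RemoveRelations in src/backend/commands/tablecmds.c): the AccessExclusiveLock is taken while the relation name is resolved, before performMultipleDeletions runs. Permission callbacks and error handling are elided; this is a sketch, not the actual PostgreSQL source.

```c
#include "postgres.h"
#include "catalog/dependency.h"
#include "catalog/namespace.h"
#include "catalog/pg_class_d.h"
#include "nodes/parsenodes.h"
#include "storage/lockdefs.h"

/* Condensed from RemoveRelations(); the real code passes a callback that
 * re-checks permissions while waiting for the lock (elided here as NULL). */
static void
drop_relation_postgres_style(RangeVar *rel, DropBehavior behavior)
{
	ObjectAddresses *objects = new_object_addresses();
	ObjectAddress obj;

	/* Name lookup and AccessExclusiveLock happen together, up front */
	Oid relOid = RangeVarGetRelidExtended(rel, AccessExclusiveLock,
										  0 /* flags */, NULL, NULL);

	obj.classId = RelationRelationId;
	obj.objectId = relOid;
	obj.objectSubId = 0;
	add_exact_object_address(&obj, objects);

	/* The lock is already held when the deletion machinery runs */
	performMultipleDeletions(objects, behavior, 0);
	free_object_addresses(objects);
}
```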
Force-pushed from a27c629 to 3490acb
Force-pushed from 3490acb to cd5f25d
I think this fix looks good, but our locking here is too complicated for our own good.
```sql
-- All generated data is part of one chunk. Only one chunk is used because 'compress_chunk' is
-- used in this isolation test. In contrast to 'policy_compression_execute' all decompression
-- operations are executed in one transaction. So, processing more than one chunk with 'compress_chunk'
-- could lead to deadlocks that are not occur real-world scenarios (due to locks hold on a completely
```
Suggested change:
```suggestion
-- could lead to deadlocks that do not occur real-world scenarios (due to locks hold on a completely
```
```sql
INSERT INTO sensor_data
SELECT
   time + (INTERVAL '1 minute' * random()) AS time,
   sensor_id,
   random() AS cpu,
   random()* 100 AS temperature
FROM
   generate_series('2022-01-01', '2022-01-15', INTERVAL '1 minute') AS g1(time),
   generate_series(1, 50, 1) AS g2(sensor_id)
ORDER BY time;
```
Nit: a little hard to read.

Suggested change:
```suggestion
INSERT INTO sensor_data
SELECT "time" + (INTERVAL '1 minute' * random()) AS "time",
       sensor_id,
       random() AS cpu,
       random()* 100 AS temperature
FROM generate_series('2022-01-01', '2022-01-15', INTERVAL '1 minute') AS g1("time"),
     generate_series(1, 50, 1) AS g2(sensor_id)
ORDER BY "time";
```
Thanks for the review. I addressed your comments in the current version of the PR.
This patch fixes a deadlock between chunk decompression and SELECT queries executed in parallel. The change in a608d7d requests an AccessExclusiveLock for the decompressed chunk instead of the compressed chunk, resulting in deadlocks. In addition, an isolation test has been added to test that SELECT queries on a chunk that is currently decompressed can be executed. Fixes timescale#4605
Force-pushed from cd5f25d to 53aed0d
This release is a patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* #4454 Keep locks after reading job status
* #4658 Fix error when querying a compressed hypertable with compress_segmentby on an enum column
* #4671 Fix a possible error while flushing the COPY data
* #4675 Fix bad TupleTableSlot drop
* #4676 Fix a deadlock when decompressing chunks and performing SELECTs
* #4685 Fix chunk exclusion for space partitions in SELECT FOR UPDATE queries
* #4694 Change parameter names of cagg_migrate procedure
* #4698 Do not use row-by-row fetcher for parameterized plans
* #4711 Remove support for procedures as custom checks
* #4712 Fix assertion failure in constify_now
* #4713 Fix Continuous Aggregate migration policies
* #4720 Fix chunk exclusion for prepared statements and dst changes
* #4726 Fix gapfill function signature
* #4737 Fix join on time column of compressed chunk
* #4738 Fix error when waiting for remote COPY to finish
* #4739 Fix continuous aggregate migrate check constraint
* #4760 Fix segfault when INNER JOINing hypertables
* #4767 Fix permission issues on index creation for CAggs

**Thanks**
* @boxhock and @cocowalla for reporting a segfault when JOINing hypertables
* @carobme for reporting constraint error during continuous aggregate migration
* @choisnetm, @dustinsorensen, @jayadevanm and @joeyberkovitz for reporting a problem with JOINs on compressed hypertables
* @daniel-k for reporting a background worker crash
* @justinpryzby for reporting an error when compressing very wide tables
* @maxtwardowski for reporting problems with chunk exclusion and space partitions
* @yuezhihan for reporting GROUP BY error when having compress_segmentby on an enum column
This patch fixes a deadlock between chunk decompression and SELECT queries executed in parallel. The change in a608d7d requests an AccessExclusiveLock for the decompressed chunk instead of the compressed chunk, resulting in deadlocks.
In addition, an isolation test has been added to test that SELECT queries on a chunk that is currently decompressed can be executed.
Fixes #4605
Fixes #2565