Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix gp_toolkit.__gp_aocsseg_history crash on non-aocs tables. #888

Merged
merged 16 commits into from
Jan 23, 2025

Conversation

avamingli
Copy link
Contributor

Fixes #ISSUE_Number

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions


avamingli and others added 7 commits January 23, 2025 17:17
We try to use macro to fetch relation'name after closing the
relation in error message, which causes a crash.
We do have heap_close in later lines, don't need to close here.

Authored-by: Zhang Mingli [email protected]
GUC gp_max_partition_level sets the maximum number of levels allowed when creating
a partitioned table use the Greenplum classic syntax. Use 0 for no limit.
This was removed during 12 merge. Reintroduce it to main branch.
This only applies to Greenplum classic syntax, Greenplum modern syntax is not affected.
GPDB_12_MERGE_FIXME in ic_tcp.c:
should use WaitEventSetWait() instead of select() following the routine in ic_udpifc.c

The advantage of using WaitEventSetWait instead of select is that we can monitor
the specific waitevent in the pg_stat_activity view.

However, we still have many select() calls in current Greenplum codes, it's no need to
replace all of them with WaitEventSet, Because most of them doesn't benefit much
from pg_stat_activity, keep using select() is good and simple enough.
Therefore, we have decided to simply remove this fixme.
```
/* GPDB_12_MERGE_FIXME: would be nicer to store this hash in the DSM segment or DSA */
shareinput_Xslice_hash = ShmemInitHash("ShareInputScan notifications",
								  N_SHAREINPUT_SLOTS(),
								  N_SHAREINPUT_SLOTS(),
					      		          &info,
								  HASH_ELEM | HASH_BLOBS);
```
Shmem is static shared memory which is pre-allocated and initialized during the startup of the
postmaster. The address of static shared memory is the same between different backends.
It stores frequently accessed fixed-size data. Using static shared memory is much simpler.

DSM and DSA is dynamic shared memory which is allocated and managed dynamically
during the runtime of the postgres. The address between different backends is different. 
We usually use a static shared memory to store dsm_handle and other infos to help
backends to find its address. It is suitable for temporary data storage. 

shareinput_Xslice_state is small and we access the field frequently,
Instead of using the static memory to store shared_scan_id, dsm_handle and other infos,
store the shareinput_Xslice_state directly in static memory is more convenient.

So remove the fixme.
pg_base64_enc_len() and its clones overestimated the output
length by up to 2 bytes, as a result of sloppy thinking about
where to divide.  No callers require a precise estimate, so
this has no consequences worse than palloc'ing a byte or two
more than necessary.  We might as well get it right though.

This bug is very ancient, dating to commit 79d78bb which
added encode.c.  (The other instances were presumably copied
from there.)  Still, it doesn't quite seem worth back-patching.

Oleg Tselebrovskiy

Discussion: https://postgr.es/m/[email protected]
(cherry picked from commit d98ed080bb31fd3d46281127871b7886288686d9)

Co-authored-by: Tom Lane <[email protected]>
It is definitely supported and used by differential recovery in-tree.
The NOTICE was added when WAL replication was introduced into GPDB in
737e216e9e5f. We have come a long way since then.

Reviewed-by: Huansong Fu <[email protected]>
yanwr1 and others added 8 commits January 23, 2025 18:06
Move oneoffPlan check to function `BuildCachedPlan`, travel through
plan list once to get all oneoffPlan, transientPlan and dependsOnRole
value.

Remove GPDB_96_MERGE_FIXME in plancache.c.
In GPDB, we generate LockRows plan node only when Global Deadlock
Detector is enabled and the select statement with locking clause
contains only one table. In such case, we can sure that there are
no motions, don't bother to check gp_segment_id here.
In function `gp_toolkit.gp_move_orphaned_files`, the following
SQL will call `pg_file_rename` to move orphaned files:

```
WITH OrphanedFiles AS (
    xxxxx
    xxxxx
)
SELECT
   OrphanedFiles.gp_segment_id,
   OrphanedFiles.oldpath,
   OrphanedFiles.newpath,
   pg_file_rename(OrphanedFiles.oldpath, OrphanedFiles.newpath,
NULL) AS move_success
 FROM OrphanedFiles
```

However, `pg_file_rename` will only be called by coordinator, which
means, in a distributed environment, segments' files can't be moved
as expected.
By explicitly updating pg_statistic, there is the possibility that
autoanalyze analyzes the table between when it is set and when we check
for the expected result. In this case, the intent of the test is to
ensure gpsd dumps quotes. I've added a row with single quotes and
removed the explicit update of pg_statistic.
…y check

GDD and other services like GPCC may always keep the connections to
coordinator or segments. As discussed in PR#16903, we should skip
checking the idle session for orphaned files detection to avoid
that we can't get a detection result due to the ERROR `There is a
client session running on one or more segment.Aborting...`.
This commit is to fix issue #17180.

With this commit, we can provide more detailed information about logerrors in pg_exttable view in GP7. This is to keep consistent with older GP version.

```
testdb=# SELECT logerrors, options FROM pg_exttable WHERE reloid = 'public.test_ext'::regclass;
 logerrors |           options
-----------+-----------------------------
 t         | {error_log_persistent=true}
(1 row)
```
…errors (#17199)

This commit is to fix issue #17179.
External table location is stored as connected string with delimiter '|' In pg_foreign_table.ftoptions `location_uris`.
If location uri contains char '|', char '|' will be mistaken as delimiter '|' and cause errors. 

So in this commit, when location uri contains char '|', char '|' will be escaped by '\'.
And when read location uri from pg_foreign_table, location uri will be reverse escaped.
installcheck was always a no-op for pg_upgrade until PostgreSQL 15
switched tests to TAP.

Remove the fixme.
Authored-by: Zhang Mingli [email protected]
@reshke reshke merged commit 336b6bc into apache:main Jan 23, 2025
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.