-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CNDB-12503 remove column name restrictions in SAI #1539
base: main
Are you sure you want to change the base?
Conversation
5f71894
to
569922d
Compare
569922d
to
12441d7
Compare
src/java/org/apache/cassandra/cql3/statements/schema/AlterSchemaStatement.java
Outdated
Show resolved
Hide resolved
private static int calculateAllowedLength(int keyspaceNameLength, int tableNameLength) | ||
{ | ||
// Prefixes and suffixes from IndexContext.getFullName | ||
int addedLength1 = keyspaceNameLength + tableNameLength + 2 + 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Magic numbers - can you make them into named constants?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also prefer to avoid magic numbers. In this case I started with constants, but it became unreadable due to many constants used only once and capitalized names does not look readable. I am open for suggestion how to make maintainable and correct code without investing too much time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I exploded the code with named variables for values and as result putting into different functions as recommended by IntelliJ.
// Prefixes and suffixes constructed by Version.stargazerFileNameFormat | ||
int addedLength2 = SAI_DESCRIPTOR.length() + 4 + IndexComponentType.KD_TREE_POSTING_LISTS.representation.length() + 3; | ||
// Prefixes from Descriptor constructor | ||
addedLength2 += 2 + 2 + SSTableFormat.Type.BTI.name().length() + 16; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here: those constant numbers scream they need names
test/unit/org/apache/cassandra/index/sai/cql/QuotedIdentifierColumnTest.java
Outdated
Show resolved
Hide resolved
cc08613
to
c6cd552
Compare
Fixed usage of renames in CNDB: https://github.com/riptano/cndb/pull/12715 |
test/unit/org/apache/cassandra/index/sai/cql/NativeIndexDDLTest.java
Outdated
Show resolved
Hide resolved
*/ | ||
private static int calculateGeneratedIndexNameLength(int keyspaceNameLength, int tableNameLength) | ||
{ | ||
int uniquenessSuffixLength = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand this sets a maximum of 100 indexes with a long name per keyspace. So for example a keyspace with a very long name could have 20 tables with 5 indexes each. It looks like a reasonable limit, albeit I wonder if for peace of mind we should use a max suffix length of 4 chars.
In any case, I would add a comment here about how this limits the number of indexes with a long name, and perhaps put that number in a constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be fully correct it will be limited to 100 only in the case of long keyspace and table names. An index name is table + column names and it will be trimmed to allow later addition of keyspace and table names.
Currently the limit of 3 chars suffix is soft, i.e., if this condition satisfied, then later it can be more than 3 chars be added as it's not controlled there. It will not fail until the number of chars is big enough, since the name length constant of 222 chars is arbitrary to my understanding and is below the file name limit by OSs.
I avoid this corner case with uniqueness suffix, I think to control the result index name to fit the allowed size as it's now done only for the given index name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I increased to 4 chars, i.e., 999 repetition of the same index name. It should be safe enough. There is still a bug of not validating length of index names, https://github.com/riptano/cndb/issues/13198
test/unit/org/apache/cassandra/index/sai/cql/NativeIndexDDLTest.java
Outdated
Show resolved
Hide resolved
Several fixes: - Improve the name of the utility function to check database object names to provide more semantics. - Remove duplicates of utility functions and constants. - Fix warnings with minor improvements in affected files.
Also fix CNDB-12609 Trying to create an SAI on a column, which name is either long or has non-alphanumeric characters, fails, while the table was successfully created with such name. This happens if an index name is not provided. The reason for this limitation is that the column name is used in the default index name, which is used in the file names. This commit fixes this limitation by removing characters other than alphabetic, numeric or underscore when creating an index name and trimming the index name to be acceptable for a file name.
Adds tests with long keyspace and table names. Improves readability of calculating additional future length by putting values into variables.
CQLTester.createIndex assumed non-qualified column names. Thus it was necessary to adjust it, since this PR removes the restriction.
Allowed length of index name is used to check given index in addition to truncate generated index name. The generated name needs also to take in account the generated suffix. Thus, splitting in tow functions.
When an index name is generated from a column name, it keeps the same capitalization for qualified column name, while SQL tester and existing tests assume that any column name is switched to lower case only.
2f4f862
to
fc5f50e
Compare
After checking through code and test it was found that only one option is used for file name with index name in it. So it's corrected and the logic is moved into SAI. Removed validation of index name length to not break existing indexes.
ad74ff0
to
fa313cb
Compare
|
✔️ Build ds-cassandra-pr-gate/PR-1539 approved by ButlerApproved by Butler |
What is the issue
Trying to create an SAI or other index on a column, which name is either long or has
non-alphanumeric characters, fails, while the table was successfully
created with such name. This happens if an index name is not provided.
The reason for this limitation is that the column name is used in
the default index name, which is used in the file names.
What does this PR fix and why was it fixed
Fixes https://github.com/riptano/cndb/issues/12503, fixes https://github.com/riptano/cndb/issues/12609
This PR removes the check for column names during an index creation. Thus it allows to create indexes on existing columns with any names including qualified names and long names.
This change to generated index name does not affect existing indexes as the existing name from
IndexMetadata
is always serialized to JSON, thus avoiding to re-generating the name.This fix required to fix CQL tester, since it was assuming that the restriction is in place. So column name regex pattern matcher is fixed in
createIndex
to allow non-restricted column names. There might still be possible corner cases on some column names used for generating index names, which will not work withcreateIndex
. In such caseexecute
should be used.It also includes several fixes in affected files:
Checklist before you submit for review
NoSpamLogger
for log lines that may appear frequently in the logs