[FLINK-34467] bump flink version to 1.20.0 #111
Conversation
Thanks for opening this pull request! Please check out our contributing guidelines. (https://flink.apache.org/contributing/how-to-contribute.html)
(force-pushed from b9d644b to d07c47a)
What's the purpose of bumping the version? If we don't bump, the connector can easily be used in Flink 1.17 .. 1.20 (Maven will choose the highest version automatically depending on the other Flink dependencies). If we bump, the connector cannot easily be used with 1.17 .. 1.19 (you'd need to explicitly exclude the dependencies or use dependency management). Usually, we bump if the connector depends on a specific API of the respective Flink version. You mentioned Lineage but I'm missing context.
@AHeise The interface is released in Flink 1.20. I saw that other connectors, for example Cassandra and JDBC (I also need to bump the Flink version for them), have been using Flink 1.19. Could we use a similar approach to do the dependency management?
The way this interface is designed will always result in a breaking change on the connector side. Any connector that uses the interface will need to release a specific version just for Flink 1.20. From the interface description and usage, I'm inferring that we need to do:
Because it's designed to be a base interface, trying to load this jar in 1.19 will cause ClassLoader exceptions. It's similar to adding new abstract methods to the Source class directly and implementing them in KafkaSource. We have two options:
and find the respective implementations through class loader scans or using SPI. Loading this class in 1.19 would fail, but there is also no need to load it. We would also need to adjust … In general, since this is a Table API feature, it also feels more natural to extend …
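To illustrate the SPI idea mentioned above, here is a minimal, hedged sketch. All names (LineageSupport, KafkaLineageFacetProvider, the service file) are made up for illustration and are not the actual connector API; the point is that only a separate implementation module would reference the Flink 1.20 lineage interfaces.

```java
import java.util.Iterator;
import java.util.Optional;
import java.util.ServiceLoader;

// Minimal sketch: the connector core only knows this optional hook; a
// 1.20-only module would ship an implementation plus a META-INF/services
// entry for it, so nothing 1.20-specific is ever resolved when running
// on older Flink versions.
public final class LineageSupport {

    /** Hypothetical extension point for lineage facets. */
    public interface KafkaLineageFacetProvider {
        Object provideLineageVertex();
    }

    /** Looks up an implementation via SPI; if none is on the classpath
     *  (e.g. when running on Flink 1.19), lineage support is simply skipped. */
    public static Optional<KafkaLineageFacetProvider> discover() {
        Iterator<KafkaLineageFacetProvider> it =
                ServiceLoader.load(KafkaLineageFacetProvider.class).iterator();
        return it.hasNext() ? Optional.of(it.next()) : Optional.empty();
    }

    private LineageSupport() {}
}
```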
I hope the community can support Flink version 1.16 due to the issue FLINK-28303.
I don't think this is related to the discussion. According to the ticket, the bug has been solved in kafka-3.0.1. I'm guessing you're running into the issue that I sketched above: we only released https://mvnrepository.com/artifact/org.apache.flink/flink-connector-kafka/3.0.1-1.17 and you'd like to bundle that with Flink 1.16. As mentioned above, you can try to override the Flink version to 1.16 (forced downgrade) and check it out. If there are no Flink 1.17 specific APIs being used, it should just work. Note that I'd like to move away from releases targeting specific Flink versions. It's buggy anyway at the moment, as https://mvnrepository.com/artifact/org.apache.flink/flink-connector-kafka/3.0.1-1.18 is also linking to 1.17 by default. We just need a matrix saying which connector version works with which Flink versions.
But I'd like to defer that discussion elsewhere and keep focusing on making compilation possible on Flink 1.20 with the 1.20 API while still being able to run it on Flink 1.18.
@AHeise
I think we can drop support for Flink 1.17 and Flink 1.18 first in this PR #102.
This sounds like a maintenance nightmare. Having separate branches sounds much better to me. Hudi needs to do it because they have their own release cycle independent of Flink. But the connector can correlate the release cycle to the Flink version, so we don't need to resort to such hackery.
We broke connectors quite a bit in the past with these new APIs or changes to existing APIs (that's also why Hudi has different modules). I think we just need to craft our API extensions more carefully, as outlined above. An API that can be used easily internally may not be suited for external repos. In theory, we could post-process compiled classes to remove usages of 1.20 classes for a 1.19 release, but that's quite involved and probably hard to understand (too much magic).
For now, I propose the following. We try to get as many other bugfixes in as possible. Then we do a release for 1.19 (I'm going to pick up the PR that you linked to get rid of 1.17 and 1.18). Then we merge this PR, add lineages, and create a new minor (or even major) release. If we need to patch 1.19, we fork from the 1.19 minor and do a patch release.
@AHeise
@AHeise What amount of time do you think is needed to accomplish this? Days, weeks or perhaps months? In case of a longer period, would it make sense to think of some temporary workarounds? A hacky way would be to copy the interfaces from https://github.com/apache/flink/pull/23626/files into the Kafka connector codebase and exclude them from the built jar. This would work because an interface definition in Java is needed only when interface methods are called, and this would happen only for Flink 1.20 or later, which contains those definitions. The copied classes could be abandoned once the package gets updated to 1.20. Just thinking out loud, as I see the interest and huge potential in finishing the lineage mechanism and would love to avoid blocking the progress on that.
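To make the class-loading question being debated here concrete, below is a minimal sketch; all names are made up for illustration (the stand-in interface only mirrors the idea of the 1.20 lineage interface, not its real signature). A class that merely references a possibly missing interface in a method is typically linked lazily, whereas a class that implements it forces the interface to be resolved when the class itself is loaded.

```java
// Hypothetical stand-in for the Flink 1.20 lineage interface; assume the jar
// containing it may be absent from the runtime classpath on older Flink.
public interface LineageVertexProvider {
    Object getLineageVertex();
}

// Only references the interface inside a method: the symbolic reference is
// usually resolved lazily, so loading ReferencesOnly tends to succeed even
// without the interface on the classpath, and only calling describe() would
// fail with a NoClassDefFoundError.
class ReferencesOnly {
    Object describe(LineageVertexProvider provider) {
        return provider.getLineageVertex();
    }
}

// Implements the interface: the JVM resolves LineageVertexProvider while
// loading ImplementsIt, so merely touching this class without the interface
// on the classpath already fails -- which is the concern raised below.
class ImplementsIt implements LineageVertexProvider {
    @Override
    public Object getLineageVertex() {
        return "kafka-source";
    }
}
```

Which of the two cases the connector classes end up in is exactly what the POC discussed below would need to demonstrate.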
It's going to be a few weeks. We don't need to wait for the release to go through though; we just need to get all fixes in.
This would be news to me but an awesome solution (I don't think we need to copy the interfaces though). Can you provide a POC that shows that this works? Afaik the classloader needs to load all base classes/interfaces on class resolution (for things like instanceof checks).
@AHeise I am trying to work on that within this PR -> https://github.com/pawel-big-lebowski/flink-connector-kafka/pull/1/files. What would be the best way to show whether it works or not? I believe it does because of Java's lazy class loading, but I'm not sure how to prove this via the Flink Kafka connector end-to-end tests.
I'm guessing the easiest way to check it is to simply include some dependency, as I don't think you need to use this repo to make your point: you can also just set up some small POC repo, or simplify a fork as much as needed.
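A small, assumed sketch of what such a POC probe could look like; running it once with the 1.20 interface on the classpath and once without would show whether the implementing class can even be loaded (the class name is a placeholder):

```java
// Hypothetical probe for a small POC repo: tries to load a class that
// implements the optional Flink 1.20 interface and reports whether the
// classloader accepts it on the current classpath.
public final class ClassLoadingProbe {

    public static void main(String[] args) {
        try {
            // Placeholder name for the class under test in the POC.
            Class.forName("org.example.poc.SourceWithLineageInterface");
            System.out.println("Loaded fine: the interface was resolvable or not needed.");
        } catch (ClassNotFoundException | NoClassDefFoundError e) {
            System.out.println("Loading failed: " + e);
        }
    }
}
```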
Another way to do it could be to use Flink 1.20 for compiling (e.g. …
@AHeise That could work as well. But the bigger problem is that …
@@ -1,89 +1,5 @@
org.apache.flink.connector.kafka.sink.FlinkKafkaInternalProducerITCase does not satisfy: only one of the following predicates match:\
* reside in a package 'org.apache.flink.runtime.*' and contain any fields that are static, final, and of type InternalMiniClusterExtension and annotated with @RegisterExtension\
* reside outside of package 'org.apache.flink.runtime.*' and contain any fields that are static, final, and of type MiniClusterExtension and annotated with @RegisterExtension\ |
I am curious what the thinking is behind removing all of these checks? Is this something that is required to move to 1.20?
It was auto-removed.
I think you are right. Given we need to support 1.18 and 1.19, I think these need to be kept.
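For reference, the rule quoted in the diff above expects ITCase classes outside org.apache.flink.runtime to register the Flink mini-cluster as a JUnit 5 extension; a compliant test skeleton looks roughly like the sketch below (class name and configuration values are illustrative).

```java
import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
import org.apache.flink.test.junit5.MiniClusterExtension;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.RegisterExtension;

// Simplified skeleton of the pattern the ArchUnit rule enforces: a static,
// final MiniClusterExtension field annotated with @RegisterExtension.
class ExampleKafkaITCase {

    @RegisterExtension
    static final MiniClusterExtension MINI_CLUSTER =
            new MiniClusterExtension(
                    new MiniClusterResourceConfiguration.Builder()
                            .setNumberTaskManagers(1)
                            .setNumberSlotsPerTaskManager(2)
                            .build());

    @Test
    void runsAgainstMiniCluster() {
        // A real ITCase would execute a Flink job against the shared
        // mini cluster here.
    }
}
```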
(force-pushed from 581c5b1 to 1534c2d)
(force-pushed from 1534c2d to 2b4ec36)
@AHeise
Awesome work, congrats on your first merged pull request!
Update the Flink version to 1.20.0 to prepare for Lineage integration support.