Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Import Speed #329

Open
Rudedog9d opened this issue Dec 14, 2024 · 7 comments
Open

Improve Import Speed #329

Rudedog9d opened this issue Dec 14, 2024 · 7 comments

Comments

@Rudedog9d
Copy link

Rudedog9d commented Dec 14, 2024

Related: #233
Related: #226

Background:

This isn't a new issue. Like others, I am experiencing some pretty slow imports when generating code based on my schema. We use Hasura and typically just export the full schema for other apps to use for codegen.

For us + Hasura specifically, it's the input_types that have the longest import time, with the calls to model_rebuild at the bottom of the file being the cause. The plugin from this comment does help a ton.

For reference, in our case, input_types ends up about 4k lines w/ about 300 lines of model_rebuild() calls.


Feature idea:

I noticed that all input types are always generated, despite our code only using a few of them.

I also noticed that the client.py file only imports the input types it needs for the actual client.

Since the codegen is already able to detect only what input types are actually needed, would it make sense to use the same logic to only generate types that are actually required? Or maybe support writing all input types to their own files within a directory, then only import what you require?

Is it possible to make something like that with a plugin?

@Rudedog9d
Copy link
Author

I guess another option could be a "InputTypeStore" or "ModelStore", where model_build is called on model retrieval, though that might be alot harder to type?

@bombsimon
Copy link
Contributor

I think a plugin could be written but maybe this should just be the default behavior. If an input type isn't used in the generated client it makes no sense to generate it. In fact, I thought this was how the generator worked today but maybe it always generates the full schema regarding types?

One issue is that Python doesn't allow you to split classes into multiple files and because of that client.py will grow huge. I tried to fix that with the ClientForwardRefs plugin which doesn't import types other than for type checking and import specific method types when the method is called. However, there are still a lot of files containing large exports such as fragments.py and input_types.py.

I think the issues got closed and the topic was somewhat dropped due to pydantic/pydantic#6748 which confirms that this is an issue with Pydantic and especially since upgrading to v2. Luckily there's been some progress lately it seems like and they're working to improve the speed of Pydantic which would have a direct impact of the generated code from this project.

@pkucmus
Copy link

pkucmus commented Dec 16, 2024

This issue is known and on our minds to fix but it'll take a while. In the meantime did you try to use:

[tool.ariadne-codegen]
...
include_comments = "none"
include_all_inputs = false
include_all_enums = false
plugins = [
    ...
    "ariadne_codegen.contrib.client_forward_refs.ClientForwardRefsPlugin",
    "ariadne_codegen.contrib.no_reimports.NoReimportsPlugin",
]

This greatly reduces the import time (as there's just that much less code).

@Rudedog9d
Copy link
Author

I think a plugin could be written but maybe this should just be the default behavior. If an input type isn't used in the generated client it makes no sense to generate it.

This is what I was thinking, too!

In fact, I thought this was how the generator worked today but maybe it always generates the full schema regarding types?

It does. And the problem is that I was exporting our full schema from Hasura, which contains quite a few ways to interact with the GQL server, even though we weren't using a majority of them.

My solution in this case was to export a schema for this specific client, which doesn't require many permissions or different schema objects - but we also use it on the NCAE Cyber Games projects, where we frequently deal with automations that can access the full schema.


I have used the ClientForwardRefsPlugin and NoReimportsPlugin plugins, but only with limited success. I still like to use the input types so we have better type safety, which ariande-codegen is great at! But that still means importing from input_types.py, which has the large number of objects/model updates at the bottom of the file.

I also tried that LazyLoadingPlugin from the comment mentioned above, which did help alot, but it does feel like a bit of a hack.

This issue is known and on our minds to fix but it'll take a while.

I absolutely don't doubt this @pkucmus, but it seems like you already have the logic to determine what's actually used in the client. I'm curious is there is a way to use this logic to only include required input types in the final input_types.py?

Any chance y'all can point me to the place in code where the selective importing for the client happens? I'm interested in exploring a solution, but I poked around a bit and wasn't really sure where to start.

One issue is that Python doesn't allow you to split classes into multiple files and because of that client.py will grow huge.

To be clear, I'm not suggesting that we split up the Client itself. For our use case, the issue is that there are 4k lines of input types defined (and 300+ model rebuilds), even though we may only have 3 operations within the client. The model rebuilds happen at the bottom of the input_types file at import time, making startup slow unless you just don't use the generated types at all (which is lameeeee!).

@Rudedog9d
Copy link
Author

I did quickly glance through pydantic/pydantic#6748, and while an improvement in pydantic would be great, I still think something could be done about eliminating unused models.

I saw somewhere in code/issues that y'all were using a flake8 thing to remove unused variables, and I was hoping I could get that to find and remove any unused exports, but I didn't have much luck in the short time I tried.

@bombsimon
Copy link
Contributor

@Rudedog9d Double checking, but did you try setting include_all_inputs to false as @pkucmus suggested? I actually didn't know about this but it seems to do exactly what you're asking:

include_all_inputs (defaults to true) - a flag specifying whether to include all inputs defined in the schema, or only those used in supplied operations

This should hopefully reduce the size and number of classes in input_types.py.

@Rudedog9d
Copy link
Author

@Rudedog9d Double checking, but did you try setting include_all_inputs to false as @pkucmus suggested? I actually didn't know about this but it seems to do exactly what you're asking:

include_all_inputs (defaults to true) - a flag specifying whether to include all inputs defined in the schema, or only those used in supplied operations

This should hopefully reduce the size and number of classes in input_types.py.

Oh.... Nope. I completely missed that, even after it was commented here 🥲 I was only paying attention to the plugins.

That's exactly what I'm looking for! I think that will solve a majority of my problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants