Improve Import Speed #329

Rudedog9d · 2024-12-14T18:11:48Z

Related: #233
Related: #226

Background:

This isn't a new issue. Like others, I am experiencing some pretty slow imports when generating code based on my schema. We use Hasura and typically just export the full schema for other apps to use for codegen.

For us + Hasura specifically, it's the input_types that have the longest import time, with the calls to model_rebuild at the bottom of the file being the cause. The plugin from this comment does help a ton.

For reference, in our case, input_types ends up about 4k lines w/ about 300 lines of model_rebuild() calls.

Feature idea:

I noticed that all input types are always generated, despite our code only using a few of them.

I also noticed that the client.py file only imports the input types it needs for the actual client.

Since the codegen is already able to detect only what input types are actually needed, would it make sense to use the same logic to only generate types that are actually required? Or maybe support writing all input types to their own files within a directory, then only import what you require?

Is it possible to make something like that with a plugin?

The text was updated successfully, but these errors were encountered:

Rudedog9d · 2024-12-14T18:16:54Z

I guess another option could be a "InputTypeStore" or "ModelStore", where model_build is called on model retrieval, though that might be alot harder to type?

bombsimon · 2024-12-14T21:18:02Z

I think a plugin could be written but maybe this should just be the default behavior. If an input type isn't used in the generated client it makes no sense to generate it. In fact, I thought this was how the generator worked today but maybe it always generates the full schema regarding types?

One issue is that Python doesn't allow you to split classes into multiple files and because of that client.py will grow huge. I tried to fix that with the ClientForwardRefs plugin which doesn't import types other than for type checking and import specific method types when the method is called. However, there are still a lot of files containing large exports such as fragments.py and input_types.py.

I think the issues got closed and the topic was somewhat dropped due to pydantic/pydantic#6748 which confirms that this is an issue with Pydantic and especially since upgrading to v2. Luckily there's been some progress lately it seems like and they're working to improve the speed of Pydantic which would have a direct impact of the generated code from this project.

pkucmus · 2024-12-16T08:34:35Z

This issue is known and on our minds to fix but it'll take a while. In the meantime did you try to use:

[tool.ariadne-codegen]
...
include_comments = "none"
include_all_inputs = false
include_all_enums = false
plugins = [
    ...
    "ariadne_codegen.contrib.client_forward_refs.ClientForwardRefsPlugin",
    "ariadne_codegen.contrib.no_reimports.NoReimportsPlugin",
]

This greatly reduces the import time (as there's just that much less code).

Rudedog9d · 2024-12-18T15:51:38Z

I think a plugin could be written but maybe this should just be the default behavior. If an input type isn't used in the generated client it makes no sense to generate it.

This is what I was thinking, too!

In fact, I thought this was how the generator worked today but maybe it always generates the full schema regarding types?

It does. And the problem is that I was exporting our full schema from Hasura, which contains quite a few ways to interact with the GQL server, even though we weren't using a majority of them.

My solution in this case was to export a schema for this specific client, which doesn't require many permissions or different schema objects - but we also use it on the NCAE Cyber Games projects, where we frequently deal with automations that can access the full schema.

I have used the ClientForwardRefsPlugin and NoReimportsPlugin plugins, but only with limited success. I still like to use the input types so we have better type safety, which ariande-codegen is great at! But that still means importing from input_types.py, which has the large number of objects/model updates at the bottom of the file.

I also tried that LazyLoadingPlugin from the comment mentioned above, which did help alot, but it does feel like a bit of a hack.

This issue is known and on our minds to fix but it'll take a while.

I absolutely don't doubt this @pkucmus, but it seems like you already have the logic to determine what's actually used in the client. I'm curious is there is a way to use this logic to only include required input types in the final input_types.py?

Any chance y'all can point me to the place in code where the selective importing for the client happens? I'm interested in exploring a solution, but I poked around a bit and wasn't really sure where to start.

One issue is that Python doesn't allow you to split classes into multiple files and because of that client.py will grow huge.

To be clear, I'm not suggesting that we split up the Client itself. For our use case, the issue is that there are 4k lines of input types defined (and 300+ model rebuilds), even though we may only have 3 operations within the client. The model rebuilds happen at the bottom of the input_types file at import time, making startup slow unless you just don't use the generated types at all (which is lameeeee!).

Rudedog9d · 2024-12-18T15:55:01Z

I did quickly glance through pydantic/pydantic#6748, and while an improvement in pydantic would be great, I still think something could be done about eliminating unused models.

I saw somewhere in code/issues that y'all were using a flake8 thing to remove unused variables, and I was hoping I could get that to find and remove any unused exports, but I didn't have much luck in the short time I tried.

bombsimon · 2024-12-26T11:24:25Z

@Rudedog9d Double checking, but did you try setting include_all_inputs to false as @pkucmus suggested? I actually didn't know about this but it seems to do exactly what you're asking:

include_all_inputs (defaults to true) - a flag specifying whether to include all inputs defined in the schema, or only those used in supplied operations

This should hopefully reduce the size and number of classes in input_types.py.

Rudedog9d · 2024-12-26T15:02:10Z

@Rudedog9d Double checking, but did you try setting include_all_inputs to false as @pkucmus suggested? I actually didn't know about this but it seems to do exactly what you're asking:

include_all_inputs (defaults to true) - a flag specifying whether to include all inputs defined in the schema, or only those used in supplied operations

This should hopefully reduce the size and number of classes in input_types.py.

Oh.... Nope. I completely missed that, even after it was commented here 🥲 I was only paying attention to the plugins.

That's exactly what I'm looking for! I think that will solve a majority of my problems.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve Import Speed #329

Improve Import Speed #329

Rudedog9d commented Dec 14, 2024 •

edited

Loading

Rudedog9d commented Dec 14, 2024

bombsimon commented Dec 14, 2024

pkucmus commented Dec 16, 2024 •

edited

Loading

Rudedog9d commented Dec 18, 2024

Rudedog9d commented Dec 18, 2024

bombsimon commented Dec 26, 2024

Rudedog9d commented Dec 26, 2024

Improve Import Speed #329

Improve Import Speed #329

Comments

Rudedog9d commented Dec 14, 2024 • edited Loading

Rudedog9d commented Dec 14, 2024

bombsimon commented Dec 14, 2024

pkucmus commented Dec 16, 2024 • edited Loading

Rudedog9d commented Dec 18, 2024

Rudedog9d commented Dec 18, 2024

bombsimon commented Dec 26, 2024

Rudedog9d commented Dec 26, 2024

Rudedog9d commented Dec 14, 2024 •

edited

Loading

pkucmus commented Dec 16, 2024 •

edited

Loading