Multiple models on single plot and ensuring names get passed through to legend. #1110

Open
jwarner8 opened this issue Feb 3, 2025 · 2 comments
Labels: enhancement (New feature or request), question (Further information is requested)

Comments

Contributor

jwarner8 commented Feb 3, 2025

If we are able to plot multiple lines on a single plot for different models, we need a legend that identifies which line belongs to which model. We can either pass the model name environment variable through to the plot operator, or add metadata containing the model name to the cube.
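To make the two options concrete, here is a minimal sketch of how a plot operator might resolve a legend label. It is not CSET code: the `Cube` class is a plain stand-in for an iris cube, and the attribute name `model_name` as a lookup key, the environment variable `MODEL_NAME`, and the helper `legend_label` are all assumptions for illustration.

```python
import os
from dataclasses import dataclass, field


@dataclass
class Cube:
    """Minimal stand-in for an iris cube: values plus an attributes dict."""

    data: list
    attributes: dict = field(default_factory=dict)


def legend_label(cube, env_var="MODEL_NAME", default="unknown model"):
    """Return the legend label for one line on the plot.

    Prefers the (assumed) ``model_name`` attribute on the cube, then falls
    back to an (assumed) environment variable, then to a default string.
    """
    name = cube.attributes.get("model_name")
    if name:
        return name
    return os.environ.get(env_var, default)


# Two hypothetical models, each carrying its name as cube metadata.
cubes = [
    Cube([280.1, 281.4], {"model_name": "model_a"}),
    Cube([279.8, 280.9], {"model_name": "model_b"}),
]
labels = [legend_label(cube) for cube in cubes]
```

With metadata on the cube, each line carries its own label, whereas a single environment variable can only name one model per task, which may matter once several models share a plot.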

jwarner8 added the enhancement and question labels Feb 3, 2025
jwarner8 changed the title from "Passing model names through to plot operator for legend." to "Multiple models on single plot and ensuring names get passed through to legend." Feb 5, 2025
Contributor Author

jwarner8 commented Feb 5, 2025

We can start by enabling multiple lines in a single plot for the case study aggregated domain mean time series. At the moment, the recipes are designed so that each model is treated separately in the workflow, and a plot is produced for that particular model. The exception is the difference plots, where each model is iterated over in the includes file, and the difference plot is created at the end because misc.difference is called in the recipe.

I am not sure of the best way to implement this as standard. My personal preference is that if more than one model is supplied in the GUI, CSET plots each model as a line in operators that produce a line (so histograms, domain mean time series). There would have to be some logic in the includes file so that if there is one model, it runs some_single_model_recipe, whereas if there are two or more, it runs multiple_model_recipe. In the latter, we could then iterate over each model, as in {% for model in models %}.
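The branching described above could look roughly like this. It is written in Python purely to show the logic; in the workflow it would live in the Jinja includes file. The recipe names are the placeholders used in this comment, and `select_recipe` is a hypothetical helper.

```python
def select_recipe(models):
    """Pick a recipe name based on how many models were supplied."""
    if not models:
        raise ValueError("at least one model is required")
    if len(models) == 1:
        # Single model: keep the existing per-model recipe.
        return "some_single_model_recipe"
    # Two or more models: use the recipe that iterates over all of them.
    return "multiple_model_recipe"


single = select_recipe(["model_a"])
multiple = select_recipe(["model_a", "model_b"])
```

Keeping the decision in one place means the single-model recipes do not need to change; only the multi-model path needs the loop over models.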

I am not clear how this would work with the case study aggregation, as we would want to ensure that each model is a single cube (collapsed by reference time, or otherwise), so that the plot operator is passed a cubelist with one cube per model. A second question is whether we pass the model name as an environment variable through the workflow, or add it as an attribute. If we do the latter, it shouldn't prevent merges/concats, as all cubes belonging to a particular model will have the same model_name attribute. Thoughts on the best way to proceed are welcome below.

Contributor Author

jwarner8 commented Feb 5, 2025

I have been thinking more about this, and I think there is a risk of more convoluted recipes/includes if we add another set that enables plotting multiple models on the same plot. A solution I prefer is to take the output that the recipes write to the plot folder, load it back in, and use it to generate either the case study aggregation or a plot with multiple lines. In both cases, you would load each model's output and merge it into a single cube, repeat for the other models, and end up with a cubelist containing one cube per model (two cubes for two models). This could then be passed to the plot operator.
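As a rough sketch of that merge step, the following groups already-processed outputs by model and joins each group into one series, giving one entry per model. It uses plain Python stand-ins rather than iris: `Cube`, `merge_by_model`, the integer time values, and the `model_name` key are assumptions, and a real implementation would more likely load the saved files with iris and rely on its merge/concatenate machinery.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Cube:
    """Stand-in for a saved recipe output: a short time series plus metadata."""

    times: list
    data: list
    attributes: dict = field(default_factory=dict)


def merge_by_model(cubes):
    """Join all pieces belonging to the same model into one cube per model."""
    grouped = defaultdict(list)
    for cube in cubes:
        grouped[cube.attributes["model_name"]].append(cube)

    merged = []
    for name in sorted(grouped):
        # Order the pieces by their first time value before joining them.
        parts = sorted(grouped[name], key=lambda c: c.times[0])
        times = [t for part in parts for t in part.times]
        data = [v for part in parts for v in part.data]
        merged.append(Cube(times, data, {"model_name": name}))
    return merged


# Hypothetical outputs from four recipe runs (two models, two cycles each).
outputs = [
    Cube([2, 3], [1.5, 1.7], {"model_name": "model_b"}),
    Cube([0, 1], [1.0, 1.2], {"model_name": "model_a"}),
    Cube([0, 1], [1.1, 1.3], {"model_name": "model_b"}),
    Cube([2, 3], [1.4, 1.6], {"model_name": "model_a"}),
]
cubelist = merge_by_model(outputs)
```

The result is a list with one cube per model, each still carrying its `model_name`, which is the shape the plot operator would need to draw one labelled line per model.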

I know we originally intended the cube output in the plot folder to be useful to a user who wishes to use it elsewhere, but currently there is a lot of duplicated processing in case aggregation (doing the exact same cube processing as other recipes), which will scale poorly for large domains and ensembles, particularly for large trials.

Is there a key reason why we are not using this postprocessed output? It introduces a dependency: one recipe has to finish before another recipe (the aggregation, or 'plotting' recipe) can start, but is that a problem? Can we not design a trigger in Cylc to run this once the previous recipe tasks have completed (not just in the same cycle point, but also the equivalent tasks in other cycle points)?


4 participants