-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handling failures in elastic flow #861
Conversation
@@ -54,3 +57,137 @@ def _get_strains(structure, sym_reduce): | |||
deformations = response[job.uuid][1].output | |||
|
|||
return [d.green_lagrange_strain for d in deformations] | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to add a test for the whole flow or at least nake sure that the parameter won't be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a test for the whole workflow is a bit problematic because it is difficult to consistently trigger the failure of one of the pertrubations. I can manually generate a test by explicitly killing vasp for one of the perturbations, but if the tests will need to be regenerated in the future it may be annoying.
It would be easier if I could modify the INCAR of only one of the perturbations, but it seems to not be possible because update_kwargs
is not applied to dynamically generated jobs (see materialsproject/jobflow#588).
I thought these tests will be enough, since all the logic stays in the fit job, but if you think it is needed I can try to generate the workflow test by killing the job. Or maybe there is a better way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. This sounds like too much work.
I think the fitting part is well tested. I was rather thinking about testing that a parameter change really arrives in the fit function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it was part of the output schema, it would be easy. 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the only missing point. As this is a rather small but still important pull request, I would in this case merge it.
Does it make sense to add the max_*
to the output schema and just test if it arrives there if you set it in the wf?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I think I had partially misunderstood your previous comment. I could add max_failed_deformations
to the output document, but I am not sure if it is really worth to have that information just to make the test. I will try if I can quickly make a test by explicitly killing one of the deformation jobs, if it gets too complicated I will add max_failed_deformations
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I managed to add a test with the flow failure triggered by the max_failed_deformations
being set to 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome. Thank you!
@gpetretto Thank you! I left a comment on the overall flow. (See Update below) |
My initial review comment was unclear (sorry about this, i should not write on my phone): what I meant to say was that I would also recommend a test at the flow level. Of course, a part of the functionality is also already tested with the default values but not everything. |
@gpetretto Thank you! I will merge. @utf I am merging this now as this is a rather small change! |
* limit maximum failures in elastic flow * add property to Elastic flow subclasses * Add flow test for failiure in elastic flow
* limit maximum failures in elastic flow * add property to Elastic flow subclasses * Add flow test for failiure in elastic flow
Following what already mentioned in #817, I have implemented a few changes in the BaseElasticFlow.
max_failed_deformations
in the Flow, that will not complete the fit job if the number of failed deformation jobs exceeds that number. The default is no limit for backward compatibility.ElasticDocument
.ElasticDocument
. If the list of failed uuids is not empty, a warning is added there.