Standardized option for short jobs #546
Comments
Funnily enough, I had also been implementing this, but in jobflow-remote directly (and discussing it earlier today with @VicTrqt, who was running into similar issues). I've started adding the option of `@job(profile="analysis")` or `@job(profile="postprocessing")`, which can then be used in the jobflow-remote config to specify a default worker and exec config for jobs that match the profile. If this could be standardized at the jobflow level it would be super helpful, as other managers could also make use of it. The same issue comes up with, e.g., jobs that require (or at least can make use of) GPUs. I don't know whether it is necessary to have a set of "known" profiles or whether this can be handled by convention (either way, the user probably has to choose the appropriate resources for a 'small' job).
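To make the profile idea concrete, here is a minimal sketch of how a manager might map a job's profile to execution settings. Everything in it is an assumption for illustration: the `PROFILE_DEFAULTS` table, the worker and exec config names, and `resolve_execution` are hypothetical and do not reflect the actual jobflow-remote configuration schema.

```python
# Hypothetical manager-side table: profile name -> execution settings.
# The worker and exec_config names are made up for this sketch.
PROFILE_DEFAULTS = {
    "analysis": {"worker": "local", "exec_config": "light"},
    "postprocessing": {"worker": "local", "exec_config": "light"},
    "gpu": {"worker": "gpu_cluster", "exec_config": "cuda"},
}

# Settings used when a job has no profile or an unrecognized one.
FALLBACK = {"worker": "default_cluster", "exec_config": "standard"}


def resolve_execution(profile):
    """Return a copy of the settings for a profile, or the fallback."""
    return dict(PROFILE_DEFAULTS.get(profile, FALLBACK))


analysis_settings = resolve_execution("analysis")
unprofiled_settings = resolve_execution(None)
```

Handling profiles "by convention" would correspond to the fallback branch here: an unknown profile name is not an error, it simply gets the default resources.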
Nice! I like the idea, as it allows more flexibility. On the other hand, it requires a bit more configuration work from the user. However, instructions for a standard worker that covers these cases can be provided in the documentation. An additional point, which concerns jobflow-remote more, is that it may be necessary to know whether a job can be executed with just its inputs or whether it needs access to files from previous jobs. For example, this function in atomate2: https://github.com/materialsproject/atomate2/blob/7f4d5a60d427295dee3a0f6a9b87deb5f47d7f8a/src/atomate2/common/jobs/defect.py#L187 is clearly something that could be executed quickly, but I think it needs to be executed on the machine where the previous jobs were executed. I am not sure if there is an easy way to define or identify these kinds of jobs. In any case, I believe this needs to be implemented directly in jobflow to be effective.
Files are definitely a big blocker for me too; not sure how to approach this with the current API (have played around a bit with additional stores but it doesn't quite make sense to me). Being able to launch a job from the context of an older job (as resolved by the manager) would be very helpful, as would resolving dependencies on data present in additional stores.
Absolutely amazing idea, and I love the design proposed by @ml-evs. The way I have been getting around this is very hacky with FireWorks...
One caveat here: if you send some jobs to the local compute resource, this would require all runtime dependencies to also be present there (which may not necessarily be the case).
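One way a manager could guard against this caveat is to check that the modules a job needs are importable before routing it to the local resource. The sketch below uses only the standard library; the `missing_modules` helper and the idea that a job declares its required modules are assumptions for illustration, not an existing jobflow or jobflow-remote feature.

```python
import importlib.util


def missing_modules(required):
    """Return the names in `required` that cannot be found locally."""
    missing = []
    for name in required:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised e.g. when the parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


# Hypothetical usage: run locally only if nothing is missing.
can_run_locally = not missing_modules(["json", "math"])
```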
In several Flows there is the need to define small and short Jobs. In `atomate2` there are Jobs that just calculate a supercell or generate perturbations. While these require minimal computational effort, identifying them and tuning their execution may be quite annoying.

For this reason I would be interested in defining a standard way of marking such jobs, so that managers can then automatically optimize their execution. In the case of `jobflow-remote` there could be an internal local worker that would automatically execute those jobs. Some helper could probably be defined for the fireworks manager as well.

The key point is that it should be possible for the `Flow` developer to directly select these jobs, instead of leaving this on the shoulders of the user. I am not sure what would be the best way of doing that. I was thinking of a new `Job` (or `JobConfig`) attribute like `fast`, `short`, or `small` that is False by default. That way it would be easily marked and easily retrieved by the manager. For example:

Any comments or ideas about this feature?
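As a rough illustration of the proposal above, the following self-contained sketch shows how a boolean flag could be set by the Flow developer and then read by a manager. It does not use jobflow itself: `SketchJob`, `sketch_job`, the `short` attribute, and `split_by_cost` are hypothetical stand-ins, and the flag name is just one of the options floated in the issue.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class SketchJob:
    """Hypothetical stand-in for a jobflow Job; not the real class."""

    function: Callable[..., Any]
    args: tuple = ()
    short: bool = False  # proposed flag, False by default


def sketch_job(function=None, *, short=False):
    """Hypothetical decorator imitating @job with the proposed flag."""

    def decorate(func):
        def make(*args):
            # Calling the decorated function builds a job instead of running it.
            return SketchJob(function=func, args=args, short=short)

        return make

    return decorate if function is None else decorate(function)


@sketch_job(short=True)
def make_supercell(size):
    return [size] * 3


@sketch_job
def relax(structure):
    return structure


def split_by_cost(jobs):
    """What a manager could do: separate short jobs from regular ones."""
    short = [j for j in jobs if j.short]
    regular = [j for j in jobs if not j.short]
    return short, regular


jobs = [make_supercell(2), relax("Si")]
short_jobs, regular_jobs = split_by_cost(jobs)
```

Here the manager would send `short_jobs` to a lightweight local worker and `regular_jobs` to the usual queue, without the user having to tag anything.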