Build an iterative phonon flow #306

Merged
merged 113 commits into from
Jan 9, 2025

Conversation

JaGeo
Collaborator

@JaGeo JaGeo commented Dec 21, 2024

Very first draft for an iterative phonon flow. There are still many, many things to solve before this can work.

  • make sure that a new random seed is used for each step (make sure the order of the settings has been considered in the seed computation)

  • reuse the previous database

  • make sure that the phonon benchmark runs are reused as well

  • expose all relevant outputs to allow for an iterative procedure

  • random_seed + len(workflow_maker.volume_custom_scale_factors)

  • fix previous unit tests

  • fix reading mace models

  • improve distort_type_1

  • make sure random seed creation is done correctly when very similar structures are provided in "structures" - currently, this is not correctly considered (needs to be done in iterative flow as well)

  • fix write_benchmark_metrics

  • document that only 0.01 displacement is used in the benchmark

  • add a warning if single displaced supercells are activated and run after iteration 0 (via postinit method)

  • do some testing on the cluster to confirm the options and capabilities

  • additional data ends up in text.extxyz. While this isn't directly used for fitting and plotting, this should not happen as users will not be able to understand where this data comes from.

  • make rmse computation more robust (add a try except block)

  • make sure all relevant data is part of the datastore (failures in do_iterative_*)
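One checklist item above asks for a warning, raised from a post-init method, when single displaced supercells are activated after iteration 0. A minimal sketch of that pattern follows. The class `IterationSettings` and its fields `iteration` and `single_displaced_supercells` are hypothetical stand-ins chosen for illustration; they are not the maker or attribute names used in this PR.

```python
import warnings
from dataclasses import dataclass


@dataclass
class IterationSettings:
    """Hypothetical stand-in for a flow maker; not the class from this PR."""

    iteration: int = 0  # hypothetical field: index of the current iteration
    single_displaced_supercells: bool = False  # hypothetical flag name

    def __post_init__(self) -> None:
        # Runs right after the generated __init__, so the check happens
        # once at construction time, before any job is submitted.
        if self.single_displaced_supercells and self.iteration > 0:
            warnings.warn(
                "Single displaced supercells are activated after iteration 0.",
                UserWarning,
                stacklevel=2,
            )


# Constructing the settings for a later iteration triggers the warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    IterationSettings(iteration=1, single_displaced_supercells=True)
    print([str(w.message) for w in caught])
```

Putting the check in `__post_init__` keeps it next to the field definitions, so it fires regardless of which flow later consumes the settings.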

These points will be moved into a subsequent issue.

  • Optimized structures might be reused in the next generation and additional data as well!
  • add better tests for random structure generation and random seeds!!!!
  • Further feature ideas: only add new rss structures where the benchmark results are not yet good enough
  • Clean up test data: I made the tests stricter and had to copy the previous test data to a new folder. We could try to clean it up. However, this should not make the tests less strict. We need a test that ensures that the random seed is updated
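The seed-related items above (a new seed per step, order-sensitive seed computation, near-identical input structures, and a test that the seed changes) could be covered along these lines. This is only a sketch under stated assumptions: the checklist mentions `random_seed + len(workflow_maker.volume_custom_scale_factors)`, whereas the hash-based helper `derive_seed` below is a different, hypothetical technique swapped in for illustration, and none of its names come from the PR.

```python
import hashlib


def derive_seed(base_seed: int, iteration: int, structure_index: int) -> int:
    """Return a deterministic 32-bit seed (hypothetical helper).

    The three inputs are joined in a fixed order before hashing, so
    swapping two values gives a different key. Using the structure's
    position instead of its content keeps duplicate structures apart.
    """
    key = f"{base_seed}:{iteration}:{structure_index}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big")


def test_seed_changes_between_iterations() -> None:
    # Every iteration should get its own seed for the same structure.
    seeds = [derive_seed(42, iteration, 0) for iteration in range(5)]
    assert len(set(seeds)) == len(seeds)
    # The same inputs must always reproduce the same seed.
    assert derive_seed(42, 1, 0) == derive_seed(42, 1, 0)
    # Order matters: (iteration=0, index=1) differs from (iteration=1, index=0).
    assert derive_seed(42, 0, 1) != derive_seed(42, 1, 0)


test_seed_changes_between_iterations()
```

A test of this shape asserts on the seed values themselves rather than only checking that the flow runs through.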

@JaGeo
Collaborator Author

JaGeo commented Dec 22, 2024

Disclaimer: This is a small programming project that I might work on for fun in the next weeks.

@JaGeo JaGeo marked this pull request as draft December 22, 2024 18:05
@JaGeo JaGeo changed the title from "Build an iterative phonon flow" to "WIP: Build an iterative phonon flow" Dec 22, 2024
@JaGeo
Collaborator Author

JaGeo commented Jan 8, 2025

@QuantumChemist could you please review this PR? The code is working for me but along the way I also fixed other problems (glue.xml, test.ext.xyz full with additional data, removal of different displacements for the benchmark) that I only partially was able to test. I haven't tested the glue.xml part and there was no test in our current tests for that part.
Just make sure I am not breaking something by accident.

if suffix == "without_regularization":
    suffix = "without_reg"
if re.match(r"job_\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}-\d{6}-\d{5}", suffix):
if suffix not in ["phonon", "rattled"]:
Collaborator Author

Old version only works for jobflow but not jobflow remote

Collaborator

Also my mistake for not thinking about the jobflow remote structure of jobs when implementing this.
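To illustrate the difference discussed in this thread, here is a small sketch comparing the two conditions from the diff. The regular expression and the list `["phonon", "rattled"]` are copied from the snippet; the wrapper functions `old_check` and `new_check` and all example suffix strings are made up for illustration, and how `suffix` is derived in the real code is not shown in this conversation.

```python
import re

# Pattern copied from the old condition in the diff.
JOB_DIR_PATTERN = re.compile(
    r"job_\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}-\d{6}-\d{5}"
)
# List copied from the new condition in the diff.
KNOWN_SUFFIXES = ["phonon", "rattled"]


def old_check(suffix: str) -> bool:
    """Old condition: true only for timestamp-style job folder names."""
    return JOB_DIR_PATTERN.match(suffix) is not None


def new_check(suffix: str) -> bool:
    """New condition: true for anything that is not a known suffix."""
    return suffix not in KNOWN_SUFFIXES


# Illustrative inputs only; the second name is invented and merely stands
# for "a folder name that does not follow the timestamp scheme".
for name in ["job_2025-01-08-12-30-45-123456-54321", "a1b2c3d4_1", "phonon"]:
    print(name, old_check(name), new_check(name))
```

The new condition no longer depends on how the job folders are named, which is the property the comment above is about.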

@QuantumChemist
Collaborator

> @QuantumChemist could you please review this PR? The code is working for me but along the way I also fixed other problems (glue.xml, test.ext.xyz full with additional data, removal of different displacements for the benchmark) that I only partially was able to test. I haven't tested the glue.xml part and there was no test in our current tests for that part. Just make sure I am not breaking something by accident.

we have one unit test for the MLIPFitMaker using glue.xml and I have extended it a bit to verify that the fit keeps working.

Comment on lines 802 to 812
def get_output(
    metrics: list,
    benchmark_structures: list[Structure] | None = None,
    benchmark_mp_ids: list[str] | None = None,
    dft_references: list[PhononBSDOSDoc] | None = None,
    pre_xyz_files: list[str] | None = None,
    pre_database_dir: str | None = None,
    fit_kwargs_list: list | None = None,
):
    """
    Job to collect all output infos for potential restarts.
Collaborator

The only thing I see code-wise is that this function name and its description could be more descriptive. I needed to look at the unit test to fully get it 😃

Collaborator

Otherwise this PR looks good to me. :D

Collaborator Author

Feel free to add more description 😀

@JaGeo
Collaborator Author

JaGeo commented Jan 8, 2025

> > @QuantumChemist could you please review this PR? The code is working for me but along the way I also fixed other problems (glue.xml, test.ext.xyz full with additional data, removal of different displacements for the benchmark) that I only partially was able to test. I haven't tested the glue.xml part and there was no test in our current tests for that part. Just make sure I am not breaking something by accident.
>
> we have one unit test for the MLIPFitMaker using glue.xml and I have extended it a bit to verify that the fit keeps working.

Thanks. Yes, as general feedback on the tests: we feel we need more asserts. Currently, many of our tests only check that the code runs through. Maybe you can open a new issue for this and identify the tests where this is the case.

@JaGeo JaGeo merged commit 50bccd1 into main Jan 9, 2025
16 checks passed
3 participants