Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
For reference, at the current stage, a database can be migrated (requires write access) using: |
There was a problem hiding this comment.
Pull request overview
This PR extends the API workflow to accept a structure_seed parameter for structure generation, ensuring that identical compositions/settings can produce different initial structures (and thus distinct jobs) when the seed changes, and that the seed is included in job hashing for caching/deduplication.
Changes:
- Add
structure_seedtoMeltQuenchParamsand include it in the job hash viasubmission.simulation.model_dump(). - Forward
structure_seedto core structure generation asrandom_seedingenerate_structure(). - Add API and workflow tests to verify hashing, forwarding, and request acceptance.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| amorphouspy_api/src/amorphouspy_api/models.py | Adds structure_seed to the simulation parameter model exposed by the API. |
| amorphouspy_api/src/amorphouspy_api/workflows/meltquench.py | Passes the new seed through to get_structure_dict(..., random_seed=...) during structure generation. |
| amorphouspy_api/src/tests/test_jobs.py | Adds tests covering hash changes, seed forwarding, and POST acceptance. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
|
Thanks @Gitdowski, can you add this to the documentation as well! |
|
@Atilaac: Good catch. Is this another place in the documentation where you would add a word on this, or is this fine? |
|
|
||
| If specified manually, using (combinations of) an unrealistically high `density`, too high `min_distance`, too many `target_atoms` or `n_molecules`, or too few `max_attempts_per_atom` can lead to placement failures. | ||
|
|
||
| Internally, the random placement of atoms is controlled by the `structure_seed` parameter. This ensures reproducibility on the one hand. On the other hand, if statistics are checked and the same system is simulated several times it is recommended to use different seeds for each run to get a better sampling of the configuration space. |
There was a problem hiding this comment.
Can you check: at the moment, if I specify the same structure_seed, do I get the same structure?
My suspicion would be: yes, on the same computer; maybe not on different computers.
What you may want to communicate here is: if you use a fixed structure_seed , you may get the same structure (not guaranteed). If you use a different structure_seed, you are guaranteed to get a different structure
There was a problem hiding this comment.
I changed the RNG to explicitely call the PCG64, PreComputed Generator, instead of relying on the default_rng() (which currently is the PCG64). Therefore, we are safe in case numpy decides to switch to a different RNG for a future release.
Furthermore, PCG64 is frozen and designed to be reproducible across different CPU achitectures and operating systems.
I just checked this for a small system on two different machines. With the default seed the same structures are generated on both machines. Also using a non-default seed produces the same structures on both machines (and of course different from the structures generated by the default seed).
Following PR #332, where a random seed to the
get_structure_dict()has been added, this PR adds the new input argumentstructure_seedto the API workflow. If now a replica simulation is started with the same composition, system size and settings, but with a differentstructure_seedit will trigger a new simulation.Also, the
structure_seedis also part of the hashing. Therefore, an existing database needs to be migrated.structure_seedto theMeltQuenchParamsclass and thegenerate_structure()