Skip to content

Feat/seed for structure generation#336

Open
Gitdowski wants to merge 7 commits intomainfrom
feat/seed-for_structure-generation
Open

Feat/seed for structure generation#336
Gitdowski wants to merge 7 commits intomainfrom
feat/seed-for_structure-generation

Conversation

@Gitdowski
Copy link
Copy Markdown
Contributor

Following PR #332, where a random seed to the get_structure_dict() has been added, this PR adds the new input argument structure_seed to the API workflow. If now a replica simulation is started with the same composition, system size and settings, but with a different structure_seed it will trigger a new simulation.
Also, the structure_seed is also part of the hashing. Therefore, an existing database needs to be migrated.

  • added structure_seed to the MeltQuenchParams class and the generate_structure()
  • added tests

@Gitdowski Gitdowski requested a review from Copilot May 6, 2026 13:20
@github-actions github-actions Bot added the type: feature Changelog: new feature or performance improvement → bumps minor version label May 6, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@Gitdowski
Copy link
Copy Markdown
Contributor Author

For reference, at the current stage, a database can be migrated (requires write access) using:

# run once with app env
from amorphouspy_api.database import get_job_store, Job
from amorphouspy_api.models import JobSubmission
from amorphouspy_api.routers.jobs_helpers import _job_hash

store = get_job_store()
with store.session() as s:
    rows = s.query(Job).filter(Job.status == "completed").all()
    for row in rows:
        req = row.request_data or {}
        sim = req.get("simulation", {})
        if "structure_seed" not in sim:
            sim["structure_seed"] = 42
            req["simulation"] = sim
            sub = JobSubmission(**req)
            new_hash = _job_hash(sub, row.composition)
            row.request_hash = new_hash
            row.request_data = req
    s.commit()

@Gitdowski Gitdowski requested a review from ltalirz May 6, 2026 13:22
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the API workflow to accept a structure_seed parameter for structure generation, ensuring that identical compositions/settings can produce different initial structures (and thus distinct jobs) when the seed changes, and that the seed is included in job hashing for caching/deduplication.

Changes:

  • Add structure_seed to MeltQuenchParams and include it in the job hash via submission.simulation.model_dump().
  • Forward structure_seed to core structure generation as random_seed in generate_structure().
  • Add API and workflow tests to verify hashing, forwarding, and request acceptance.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
amorphouspy_api/src/amorphouspy_api/models.py Adds structure_seed to the simulation parameter model exposed by the API.
amorphouspy_api/src/amorphouspy_api/workflows/meltquench.py Passes the new seed through to get_structure_dict(..., random_seed=...) during structure generation.
amorphouspy_api/src/tests/test_jobs.py Adds tests covering hash changes, seed forwarding, and POST acceptance.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread amorphouspy_api/src/amorphouspy_api/models.py
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@Atilaac
Copy link
Copy Markdown
Contributor

Atilaac commented May 6, 2026

Thanks @Gitdowski, can you add this to the documentation as well!

@Gitdowski
Copy link
Copy Markdown
Contributor Author

@Atilaac: Good catch. Is this another place in the documentation where you would add a word on this, or is this fine?

Comment thread docs/theory/structure.md Outdated

If specified manually, using (combinations of) an unrealistically high `density`, too high `min_distance`, too many `target_atoms` or `n_molecules`, or too few `max_attempts_per_atom` can lead to placement failures.

Internally, the random placement of atoms is controlled by the `structure_seed` parameter. This ensures reproducibility on the one hand. On the other hand, if statistics are checked and the same system is simulated several times it is recommended to use different seeds for each run to get a better sampling of the configuration space.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check: at the moment, if I specify the same structure_seed, do I get the same structure?
My suspicion would be: yes, on the same computer; maybe not on different computers.

What you may want to communicate here is: if you use a fixed structure_seed , you may get the same structure (not guaranteed). If you use a different structure_seed, you are guaranteed to get a different structure

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the RNG to explicitely call the PCG64, PreComputed Generator, instead of relying on the default_rng() (which currently is the PCG64). Therefore, we are safe in case numpy decides to switch to a different RNG for a future release.
Furthermore, PCG64 is frozen and designed to be reproducible across different CPU achitectures and operating systems.

I just checked this for a small system on two different machines. With the default seed the same structures are generated on both machines. Also using a non-default seed produces the same structures on both machines (and of course different from the structures generated by the default seed).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type: feature Changelog: new feature or performance improvement → bumps minor version

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants