2 changes: 2 additions & 0 deletions docs/api-reference/atomic.md
@@ -53,6 +53,8 @@ Atom definitions, element properties, electronic structure (SCF), orbital evalua
## Pretraining

```{eval-rst}
.. autoclass:: jaqmc.utils.atomic.pretrain.PretrainReferenceConfig

.. autofunction:: jaqmc.utils.atomic.pretrain.make_pretrain_log_amplitude
.. autofunction:: jaqmc.utils.atomic.pretrain.make_pretrain_loss
```
3 changes: 2 additions & 1 deletion docs/guide/estimators/ecp.md
@@ -44,6 +44,7 @@ where $r = |\mathbf{r} - \mathbf{R}|$ is the electron-atom distance, $\mathbf{r}

## See also

- The ECP estimator is automatically added when `ecp` is set in the system configuration. See [Basis Sets and ECPs](#molecule-basis-sets-and-ecps).
- The ECP estimator is automatically added when `ecp` is set in the system
configuration. See the molecule guide's [ECP setup](#molecule-ecps).
- Configuration: [Molecule](#molecule-estimators), [Solid](#solid-estimators)
- API: {class}`~jaqmc.estimator.ecp.estimator.ECPEnergy`
116 changes: 96 additions & 20 deletions docs/systems/molecule/index.md
@@ -5,7 +5,8 @@ boundary conditions. Most runs start from a YAML definition and a
single `jaqmc molecule train` command. JaQMC then follows the standard
molecular workflow:

1. **Hartree-Fock (HF)** computes reference orbitals with PySCF.
1. **Hartree-Fock (HF)** computes a reference electronic-structure solution with
PySCF.
2. **Pretraining** matches the neural wavefunction to those orbitals.
3. **VMC training** performs the main energy optimization.

@@ -39,6 +40,12 @@ system:
electron_spins: [5, 5] # [n_up, n_down]
```

`electron_spins` gives `[n_up, n_down]` for the electrons included in the QMC
simulation. This water example is all-electron, so `[5, 5]` includes all ten
electrons. If you later add an ECP, leave out the core electrons it replaces;
`electron_spins` should count only the valence electrons JaQMC samples
explicitly.

Then run training:

```bash
@@ -76,14 +83,14 @@ that generate the underlying configuration for you.
### Single Atoms

For a single atom, `system.module=atom` is a shortcut. You provide the element
symbol and optional HF settings, and JaQMC fills in the matching electron spin
configuration automatically.
symbol, and JaQMC fills in the matching electron spin configuration
automatically. By default it uses the all-electron count; when `system.ecp` is
set, it uses the valence count instead.

```yaml
system:
module: atom
symbol: Li # Element symbol (H, He, Li, Be, ...)
basis: sto-3g # Basis set for SCF initialization
# ecp: ccecp # Optional: effective core potential
```

@@ -96,17 +103,17 @@ jaqmc molecule train --yml atom_li.yml workflow.save_path=./runs/atom_li
### Diatomic Molecules

For common two-atom systems, `system.module=diatomic` is a shortcut. You provide
the chemical formula, bond length, and optional total spin; JaQMC places the
atoms along the z-axis and computes `electron_spins` for you.
the chemical formula, bond length, and optional spin for the simulated
electrons. JaQMC places the atoms along the z-axis and computes
`electron_spins` for you.

```yaml
system:
module: diatomic
formula: LiH # Chemical formula (H2, LiH, N2, ClF, ...)
bond_length: 3.015 # Distance between atoms
unit: bohr # Length unit for bond_length
spin: 0 # n_up - n_down for the full molecule
basis: cc-pvdz
spin: 0 # n_up - n_down for electrons being simulated
```

Save as `li_h_diatomic.yml`, then run:
@@ -115,36 +122,105 @@ Save as `li_h_diatomic.yml`, then run:
jaqmc molecule train --yml li_h_diatomic.yml workflow.save_path=./runs/li_h_diatomic
```

(molecule-basis-sets-and-ecps)=
## Basis Sets and ECPs
(molecule-ecps)=
## Effective core potentials

Most examples above are all-electron calculations: JaQMC represents every
electron in the molecule explicitly. For heavier elements, you may instead
replace core electrons with an effective core potential (ECP). The core
electrons no longer appear as QMC electrons; their effect enters through the
pseudopotential, while JaQMC samples the remaining valence electrons.

Enable an ECP by setting `system.ecp`:

```yaml
system:
ecp: ccecp
```

Use an ECP designed for correlated many-body calculations rather than a
DFT-only pseudopotential. The correlation-consistent ECP family, `ccecp`, is the
usual choice for QMC runs.

Once an ECP is enabled, `electron_spins` describes the electrons being sampled,
not the full electron count of the physical atoms. The `atom` and `diatomic`
shortcuts use `system.ecp` to choose the valence count automatically. If you
define `atoms` and `electron_spins` directly, set `electron_spins` to the
valence-electron system you want to simulate.
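
The counting rule above can be illustrated with a short Python sketch. This is not JaQMC's implementation: the helper name `valence_electron_spins` and the `n_core` argument are hypothetical, and the number of core electrons an ECP removes is something you would look up for the specific pseudopotential rather than take from this example.

```python
def valence_electron_spins(n_electrons, n_core=0, spin=0):
    """Return [n_up, n_down] for the electrons that are sampled explicitly.

    n_electrons: total electron count of the physical system
    n_core:      electrons replaced by the ECP (0 for all-electron runs);
                 an assumed input, not something this sketch looks up
    spin:        n_up - n_down for the simulated electrons
    """
    n_valence = n_electrons - n_core
    if n_valence < 0:
        raise ValueError("n_core exceeds the total electron count")
    # n_up + n_down = n_valence and n_up - n_down = spin must both hold.
    if abs(spin) > n_valence or (n_valence + spin) % 2 != 0:
        raise ValueError("spin is incompatible with the electron count")
    n_up = (n_valence + spin) // 2
    return [n_up, n_valence - n_up]


# All-electron water: ten electrons, closed shell.
print(valence_electron_spins(10))            # [5, 5]
# LiH, assuming (for illustration) an ECP that removes two Li core electrons.
print(valence_electron_spins(4, n_core=2))   # [1, 1]
```

The same arithmetic applies when you write `electron_spins` by hand: subtract the replaced core electrons first, then split the remainder according to the spin you want.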

For mixed systems, apply ECPs only to the elements that need them:

```yaml
system:
ecp:
Fe: ccecp
```

(molecule-pretrain-reference)=
## Pretrain reference settings

The `basis` parameter controls the basis set used for the HF calculation. Any basis set supported by PySCF works:
`pretrain.reference.*` configures the PySCF Hartree-Fock calculation used to
generate the target orbitals for pretraining. In most runs, the basis is the
only reference setting you need to choose. The default is cc-pVDZ, and you can
change it with:

- Minimal: `sto-3g` (default, fast)
- Split-valence: `6-31g`, `6-311g`
- Correlation-consistent: `cc-pvdz`, `cc-pvtz`, `cc-pvqz`
```yaml
pretrain:
reference:
basis: sto-3g
```

For heavy elements (transition metals, lanthanides), use an effective core potential (ECP) to replace core electrons with a pseudopotential, reducing the number of electrons treated explicitly:
If the system uses an ECP, choose a pretrain basis that matches that
pseudopotential. For example, with ccECP use the corresponding ccECP basis
family:

```yaml
system:
module: atom
symbol: Fe
basis: ccecpccpvdz
ecp: ccecp
pretrain:
reference:
basis: ccecpccpvdz
```

Both `basis` and `ecp` can be specified per element:
For mixed systems, keep the same per-element split between the physical system
and the HF reference: put ECPs in `system.ecp`, and put matching PySCF basis
choices in `pretrain.reference.basis`.

```yaml
system:
basis:
Fe: ccecpccpvdz
O: cc-pvdz
ecp:
Fe: ccecp
pretrain:
reference:
basis:
Fe: ccecpccpvdz
O: cc-pvdz
```

When the HF calculation itself needs tuning, use the `pretrain.reference.*`
block for PySCF solver settings. JaQMC supports
`pretrain.reference.method` (`UHF` or `RHF`) and forwards additional keys to the
selected PySCF mean-field object.

```yaml
pretrain:
reference:
method: RHF
basis: cc-pvdz
conv_tol: 1.0e-10
max_cycle: 200
diis_space: 12
```

Use these extra keys for SCF convergence and solver behavior tuning, such as
`conv_tol`, `max_cycle`, and related PySCF options. If a key is not supported by
the selected PySCF object, JaQMC ignores it and logs a warning.
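
One way to picture the forwarding behavior described above is the following Python sketch. It is an assumption-laden illustration, not JaQMC's code: `FakeMeanField` stands in for a PySCF mean-field object, `apply_solver_options` is a hypothetical helper, and detecting unsupported keys with `hasattr` is a guess at the mechanism.

```python
import logging

logger = logging.getLogger("reference_sketch")


class FakeMeanField:
    """Stand-in for a PySCF mean-field object (not a real PySCF class)."""

    def __init__(self):
        self.conv_tol = 1e-9
        self.max_cycle = 50
        self.diis_space = 8


def apply_solver_options(mf, options):
    """Set supported attributes on mf; warn about and skip unknown keys."""
    ignored = []
    for key, value in options.items():
        if hasattr(mf, key):
            setattr(mf, key, value)
        else:
            logger.warning("Ignoring unsupported reference option %r", key)
            ignored.append(key)
    return ignored


mf = FakeMeanField()
ignored = apply_solver_options(
    mf, {"conv_tol": 1.0e-10, "max_cycle": 200, "not_a_real_option": 1}
)
print(mf.conv_tol, mf.max_cycle, ignored)
```

Under these assumptions a misspelled key is silently dropped apart from the warning, so it is worth checking the log when a tuning option appears to have no effect.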

For authoritative key definitions and defaults under `pretrain.reference.*`, see
<project:train.md>.

## Estimators

The training stage computes energy from several components: kinetic energy,
10 changes: 10 additions & 0 deletions docs/systems/molecule/train.md
@@ -185,6 +185,16 @@ Initializes the neural network to approximate Hartree-Fock orbitals before VMC
training. It uses the same run, sampler, and writer schemas as the train stage,
but with a different optimizer default and a workflow-wired supervised loss.

### Reference (`pretrain.reference.*`)

The Hartree-Fock reference is the PySCF calculation JaQMC uses to generate the
target orbitals for pretraining. Most runs can keep the default settings.

```{eval-rst}
.. config-defaults:: jaqmc.app.molecule.config.base.MoleculePretrainReferenceConfig
:prefix: pretrain.reference
```

### Run options (`pretrain.run.*`)

```{eval-rst}
11 changes: 11 additions & 0 deletions docs/systems/solid/eval.md
@@ -31,6 +31,17 @@ are identical to the [training system config](#solid-train-system).
Must match the training run. The effective defaults and built-in module choices
are identical to the [training wavefunction config](#solid-train-wf).

## Reference (`reference.*`)

The Hartree-Fock reference is the PySCF calculation JaQMC uses when it needs
reference orbitals or related setup from that calculation. Set these values to
match the reference configuration used during training.

```{eval-rst}
.. config-defaults:: jaqmc.app.solid.config.base.SolidPretrainReferenceConfig
:prefix: reference
```

## Run Options (`run.*`)

Evaluation reuses the same checkpointing and sampling controls as training, but
74 changes: 65 additions & 9 deletions docs/systems/solid/index.md
@@ -5,7 +5,8 @@ boundary conditions. Most runs start from a YAML definition and a
single `jaqmc solid train` command. JaQMC then follows the same three-stage
workflow used for [molecules](../molecule/index.md):

1. **Hartree-Fock (HF)** computes reference orbitals with PySCF.
1. **Hartree-Fock (HF)** computes a reference electronic-structure solution with
PySCF.
2. **Pretraining** matches the neural wavefunction to those orbitals.
3. **VMC training** performs the main energy optimization.

@@ -47,9 +48,15 @@ system:
- symbol: H
coords: [3.78, 3.78, 3.78]
electron_spins: [2, 2] # [n_up, n_down] per primitive cell
basis: sto-3g
```

`electron_spins` is counted per primitive cell and describes the electrons JaQMC
samples explicitly. In an all-electron solid, that is the full electron count
per primitive cell. With an ECP, core electrons are replaced by the
pseudopotential, so `electron_spins` should count only the valence electrons.
If you later expand to a supercell, JaQMC multiplies these primitive-cell counts
by the number of primitive cells in the supercell.

Then run training:

```bash
@@ -78,7 +85,9 @@ that generate the underlying configuration for you.

For FCC rock-salt structures such as LiH or NaCl, `system.module=rock_salt` is
a shortcut. You provide the species and lattice constant, and JaQMC builds the
primitive cell and fills in the corresponding electron counts automatically.
primitive cell and fills in the corresponding electron counts automatically. It
uses all-electron counts by default; when `system.ecp` is set, it uses valence
counts instead.

```yaml
system:
@@ -88,7 +97,6 @@ system:
lattice_constant: 4.0 # in angstrom by default
unit: angstrom # or "bohr"
# supercell: [2, 2, 2] # Optional diagonal supercell shorthand
basis: sto-3g
```

Save as `rock_salt.yml`, then run:
@@ -100,8 +108,9 @@ jaqmc solid train --yml rock_salt.yml workflow.save_path=./runs/rock_salt
### Two-Atom Chain

For simple one-dimensional test systems, `system.module=two_atom_chain` is a
shortcut. You provide the element, bond length, and optional spin; JaQMC builds
a primitive cell with two atoms along the chain direction.
shortcut. You provide the element, bond length, and optional spin for the
simulated electrons. JaQMC builds a primitive cell with two atoms along the
chain direction.

```yaml
system:
@@ -111,7 +120,6 @@ system:
unit: bohr # or "angstrom"
spin: 0 # n_up - n_down per primitive cell
# supercell: 4 # Optional repetition along the chain direction
basis: sto-3g
```

Save as `two_atom_chain.yml`, then run:
@@ -120,8 +128,56 @@ Save as `two_atom_chain.yml`, then run:
jaqmc solid train --yml two_atom_chain.yml workflow.save_path=./runs/two_atom_chain
```

Basis sets and ECPs work the same as for
[molecules](#molecule-basis-sets-and-ecps).
(solid-ecps)=
## Effective core potentials

Solids use the same ECP mechanism as molecules: core electrons are replaced by a
pseudopotential, and JaQMC samples only the remaining valence electrons. Set
`system.ecp` to apply an ECP. A mapping lets you apply it only to selected
elements:

```yaml
system:
ecp:
Li: ccecp
```

The `rock_salt` and `two_atom_chain` shortcuts use `system.ecp` to choose
valence electron counts automatically. If you define `atoms` and
`electron_spins` directly, set `electron_spins` to the valence-electron count per
primitive cell.
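
As a rough sketch of how per-primitive-cell counts relate to the electrons actually sampled in a supercell, the snippet below scales `[n_up, n_down]` by the number of primitive cells. The helper name `supercell_electron_spins` is hypothetical, and it assumes a diagonal supercell such as `[2, 2, 2]`, where the cell count is the product of the three factors.

```python
from math import prod


def supercell_electron_spins(primitive_spins, supercell=(1, 1, 1)):
    """Scale per-primitive-cell [n_up, n_down] to a diagonal supercell."""
    if any(int(n) != n or n < 1 for n in supercell):
        raise ValueError("supercell factors must be positive integers")
    n_cells = prod(supercell)  # number of primitive cells in the supercell
    n_up, n_down = primitive_spins
    return [n_up * n_cells, n_down * n_cells]


# All-electron LiH primitive cell: [2, 2].
print(supercell_electron_spins([2, 2]))             # [2, 2]
# Same cell in a 2x2x2 supercell (8 primitive cells).
print(supercell_electron_spins([2, 2], (2, 2, 2)))  # [16, 16]
# Valence-only counts, assuming an ECP that removes two Li core electrons.
print(supercell_electron_spins([1, 1], (2, 2, 2)))  # [8, 8]
```

In the configuration itself you still specify only the primitive-cell counts; the scaled numbers are what the sampler ends up handling.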

See <project:#molecule-ecps> for the broader ECP setup guidance.

(solid-pretrain-reference)=
## Pretrain reference settings

`pretrain.reference.*` configures the PySCF Hartree-Fock calculation used for
pretraining. For most solid runs, the basis is the only reference setting you
need to choose. The default is cc-pVDZ, and you can change it with:

```yaml
pretrain:
reference:
basis: sto-3g
```

If the system uses an ECP, choose a pretrain basis that matches that
pseudopotential:

```yaml
system:
ecp:
Li: ccecp
pretrain:
reference:
basis:
Li: ccecpccpvdz
H: cc-pvdz
```

The available reference settings are shared with molecule runs; see
<project:#molecule-pretrain-reference> for the detailed discussion.

## Supercell Expansion

10 changes: 10 additions & 0 deletions docs/systems/solid/train.md
@@ -172,6 +172,16 @@ Initializes the neural network to approximate Hartree-Fock orbitals before VMC
training. It uses the same run, sampler, and writer schemas as the train stage,
but with a different optimizer default and a workflow-wired supervised loss.

### Reference (`pretrain.reference.*`)

The Hartree-Fock reference is the PySCF calculation JaQMC uses to generate the
target orbitals for pretraining. Most runs can keep the default settings.

```{eval-rst}
.. config-defaults:: jaqmc.app.solid.config.base.SolidPretrainReferenceConfig
:prefix: pretrain.reference
```

### Run options (`pretrain.run.*`)

```{eval-rst}
4 changes: 2 additions & 2 deletions src/jaqmc/app/molecule/config/__init__.py
@@ -1,6 +1,6 @@
# Copyright (c) 2025-2026 Bytedance Ltd. and/or its affiliates
# SPDX-License-Identifier: Apache-2.0

from .base import MoleculeConfig
from .base import MoleculeConfig, MoleculePretrainReferenceConfig

__all__ = ["MoleculeConfig"]
__all__ = ["MoleculeConfig", "MoleculePretrainReferenceConfig"]