Merged
3 changes: 2 additions & 1 deletion README.md
@@ -261,8 +261,9 @@ These are currently used to compute properties of a minimum energy conformation
| `Curated tmQM-xtb Dataset: T=100K Dataset Restricted to Pd, Zn, Fe, Cu v0.0` | [2025-03-17-Curated-tmQM-xtb-Dataset-T=100K-Dataset-Restricted-to-Pd-Zn-Fe-Cu-v0.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2025-03-17-Curated-tmQM-xtb-Dataset-T=100K-Dataset-Restricted-to-Pd-Zn-Fe-Cu-v0.0) | BP86/def2-TZVP Conformers for single metal complexes with Pd, Fe, Zn, Cu, Mg, Li and charge of {-1,0,+1} | Br, C, Cl, Cu, F, Fe, H, N, O, P, Pd, S, Zn ||
| `OpenFF Cresset Additional Coverage Hessian v4.0` | [2025-03-31-OpenFF-Cresset-Additional-Coverage-Hessian-v4.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2025-03-31-OpenFF-Cresset-Additional-Coverage-Hessian-v4.0) | Hessian single points for the final molecules in the [OpenFF Cresset Additional Coverage Optimizations v4.0 dataset](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2025-03-06-OpenFF-Cresset-Additional-Coverage-Optimizations-v4.0) | O, C, F, S, H, N, Br, Cl ||
| `OpenFF Optimization Hessians 2019-07 to 2025-03 v4.0` | [2025-04-14-OpenFF-Optimization-Hessians-2019-07-to-2025-03-v4.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2025-04-14-OpenFF-Optimization-Hessians-2019-07-to-2025-03-v4.0) | Hessian single points for the final molecules in OpenFF optimization datasets from 2019-07 to 2025-03 | S, H, O, Br, F, N, P, Cl, I, C ||
| `OpenFF CX3-CX4 singlepoints v4.0"` | [2025-05-21-OpenFF-CX3-CX4-singlepoints-v4.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2025-05-21-OpenFF-CX3-CX4-singlepoints-v4.0) | Single-points of molecules where Sage 2.2.1 torsions t17 and t18 have been driven | Br, C, Cl, F, H, I, N, O, S ||
| `OpenFF CX3-CX4 singlepoints v4.0` | [2025-05-21-OpenFF-CX3-CX4-singlepoints-v4.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2025-05-21-OpenFF-CX3-CX4-singlepoints-v4.0) | Single-points of molecules where Sage 2.2.1 torsions t17 and t18 have been driven | Br, C, Cl, F, H, I, N, O, S ||
| `MLPepper RECAP Optimized Fragments v1.1` | [2025-07-01-MLPepper-RECAP-Optimized-Fragments-v1.1](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2025-07-01-MLPepper-RECAP-Optimized-Fragments-v1.1) | Single point property calculations for charge models, expanded to include iodine | P, B, Cl, Br, C, H, I, F, O, N, Si, S ||
| `tmQM xtb Dataset T=100K low-mw high-coordinate mult=1 v0.0` | [2025-08-14-tmQM-xtb-Dataset-T=100K-low-mw-high-coordinate-mult=1-v0.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2025-08-14-tmQM-xtb-Dataset-T=100K-low-mw-high-coordinate-mult=1-v0.0) | BP86/def2-TZVP Conformers for single metal complexes with Pd, Fe, Zn, Cu, and charge of {-1,0,+1} and multiplicity of 1. MW <= 600 Da, generally high coordinate and a max of 31 geometry samples | Br, C, Cl, Cu, F, Fe, H, N, O, P, Pd, S, Zn ||



@@ -0,0 +1,61 @@
# tmQM xtb Dataset T=100K low-mw high-coordinate mult=1 v0.0

### Description

This dataset was generated starting from an adaptation of the tmQM dataset (https://zenodo.org/records/17042449).
It contains 10,235 unique systems with 306,993 total configurations / spin states below 600 Da. The molecules are
limited to containing the transition metals Pd, Zn, Fe, or Cu, and otherwise only contain the elements Br, C, H, P, S, O, N, F,
or Cl, with charges {-1,0,+1}. The metal is restricted to greater than three coordination sites for Pd, four for Fe,
and one for Cu and Zn. Each molecule was preprocessed using gfn2-xtb, and then a short MD simulation was
performed to provide a maximum of 30 off-optimum configurations per molecule, in addition to the minimized geometry, at
a multiplicity of 1. The singlepoint calculations were then run with BP86/def2-TZVP on the geometries generated by molecular
dynamics using gfn-xtb. Each configuration is reported with the following properties: 'energy', 'gradient', 'dipole', 'quadrupole',
'wiberg_lowdin_indices', 'mayer_indices', 'lowdin_charges', 'dipole_polarizabilities', 'mulliken_charges'. SMILES
strings were generated with tmos (https://github.com/openforcefield/tmos) when possible. These SMILES strings can be
imported into RDKit for initial visualization, but will not reflect the coordinate geometries presented from tmQM.
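
The selection criteria described above can be written as a simple predicate. The following is a minimal sketch, not the code from `generate-dataset.ipynb`: the function name `passes_filters`, its arguments, and the reading of "greater than" as a strict inequality on the metal's coordination number are assumptions made for illustration.

```python
# Illustrative sketch only: names and thresholds are one reading of the
# description above, not the notebook's actual implementation.
ALLOWED_METALS = {"Pd", "Zn", "Fe", "Cu"}
ALLOWED_NONMETALS = {"Br", "C", "H", "P", "S", "O", "N", "F", "Cl"}
ALLOWED_ELEMENTS = ALLOWED_METALS | ALLOWED_NONMETALS
ALLOWED_CHARGES = {-1, 0, 1}
# Assumed strict lower bounds on the coordination number of the metal.
MIN_COORDINATION_EXCLUSIVE = {"Pd": 3, "Fe": 4, "Cu": 1, "Zn": 1}
MAX_MW_DA = 600.0


def passes_filters(elements, charge, mol_weight, coordination):
    """Return True if a complex satisfies the stated selection criteria.

    elements:     list of element symbols, one per atom
    charge:       total molecular charge
    mol_weight:   molecular weight in Da
    coordination: coordination number of the metal center
    """
    metals = [e for e in elements if e in ALLOWED_METALS]
    if len(metals) != 1:  # single-metal complexes only
        return False
    if any(e not in ALLOWED_ELEMENTS for e in elements):
        return False
    if charge not in ALLOWED_CHARGES:
        return False
    if mol_weight > MAX_MW_DA:
        return False
    return coordination > MIN_COORDINATION_EXCLUSIVE[metals[0]]
```

For example, under these assumptions a neutral four-coordinate Zn complex of 250 Da passes, while a three-coordinate Pd complex does not.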

### General Information

- Date: 2025-08-14
- Purpose: BP86/def2-TZVP Conformers for single metal complexes with Pd, Fe, Zn, Cu, and charge of {-1,0,+1} and multiplicity of 1. MW <= 600 Da, generally high coordinate, and a max of 31 geometry samples
- Dataset Type: singlepoint
- Name: tmQM xtb Dataset T=100K low-mw high-coordinate mult=1 v0.0
- Number of unique molecules: 10,235
- Number of filtered molecules: 0
- Number of Conformers: 306,993
- Number of conformers (min mean max): 3, 30, 31
- Molecular Weight (min mean max): 95 462 600
- Set of charges: -1.0, 0.0, 1.0
- Dataset Submitter: Jennifer A. Clark
- Dataset Curator: Christopher R. Iacovella
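
The counts in this list can be cross-checked against each other with a few lines of arithmetic. This is a small sketch using only the numbers stated above; the variable names are illustrative.

```python
# Values copied from the list above; variable names are illustrative.
n_molecules = 10_235
n_conformers = 306_993
max_md_samples = 30  # off-optimum MD configurations per molecule
max_per_molecule = max_md_samples + 1  # plus the minimized geometry

mean_per_molecule = n_conformers / n_molecules
upper_bound = n_molecules * max_per_molecule

print(f"mean conformers per molecule: {mean_per_molecule:.2f}")  # 29.99
print(f"upper bound on conformers: {upper_bound:,}")  # 317,285
```

The mean rounds to the reported value of 30, and the total stays below the bound implied by at most 31 geometries per molecule.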

### QCSubmit generation pipeline

- `generate-dataset.ipynb`: A Python notebook that shows how the dataset was prepared from the input files.

### QCSubmit Manifest

- `generate-dataset.ipynb`
- `environment.yml`: Conda environment file to perform this workflow
- `environment_full.yml`: All installed packages with versions for successful completion of this workflow
- `scaffold.json.bz2`: A compressed json file of the target dataset
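
A bz2-compressed JSON file such as `scaffold.json.bz2` can be inspected with the Python standard library alone. The sketch below assumes nothing about the internal layout of the scaffold file; the helper names `load_compressed_json` and `summarize` are hypothetical.

```python
import bz2
import json


def load_compressed_json(path):
    """Read a bz2-compressed JSON file and return the parsed object."""
    with bz2.open(path, mode="rt", encoding="utf-8") as handle:
        return json.load(handle)


def summarize(obj):
    """Return the sorted top-level keys of a dict, the length of a list,
    or the type name for anything else."""
    if isinstance(obj, dict):
        return sorted(obj)
    if isinstance(obj, list):
        return len(obj)
    return type(obj).__name__
```

Calling `summarize(load_compressed_json("scaffold.json.bz2"))` from the submission directory gives a first look at the file's top-level structure without loading any QCArchive tooling.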

### Metadata

* Elements: {'Br', 'C', 'Cl', 'Cu', 'F', 'Fe', 'H', 'N', 'O', 'P', 'Pd', 'S', 'Zn'}
* QC Specifications: BP86/def2-TZVP
  * program: psi4
  * method: BP86
  * basis: def2-TZVP
  * driver: gradient
  * implicit_solvent: None
  * keywords: {}
  * maxiter: 500
  * SCF Properties:
    * dipole
    * quadrupole
    * wiberg_lowdin_indices
    * mayer_indices
    * lowdin_charges
    * dipole_polarizabilities
    * mulliken_charges
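
For scripting against this metadata, the specification can be held as plain data. This is a minimal sketch: the `QCSpec` dataclass is hypothetical and is not the QCSubmit or QCPortal API; its field names and values simply mirror the list above.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass(frozen=True)
class QCSpec:
    """Hypothetical container; field names mirror the metadata list above."""

    program: str = "psi4"
    method: str = "BP86"
    basis: str = "def2-TZVP"
    driver: str = "gradient"
    implicit_solvent: Optional[str] = None
    keywords: dict = field(default_factory=dict)
    maxiter: int = 500
    scf_properties: tuple = (
        "dipole",
        "quadrupole",
        "wiberg_lowdin_indices",
        "mayer_indices",
        "lowdin_charges",
        "dipole_polarizabilities",
        "mulliken_charges",
    )


spec = QCSpec()
print(f"{spec.method}/{spec.basis}")  # prints "BP86/def2-TZVP"
```

`asdict(spec)` turns the same values into a dictionary, which is convenient for comparing against whatever specification a downloaded dataset reports.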
@@ -0,0 +1,22 @@
name: qca-clean
channels:
- conda-forge
dependencies:
- python=3.11
- numpy
- jupyter
- pandas
- h5py
- periodictable
- qcportal>=0.61
- qcfractal>=0.61
- qcfractalcompute>=0.59
- rdkit>=2025.3.3
- openbabel
- deepdiff
- py3Dmol
- scipy
- networkx
- pip:
  - git+https://github.com/openforcefield/tmos.git
  - git+https://github.com/MDAnalysis/mdanalysis.git@develop#subdirectory=package # delete after 2.10 is released
@@ -0,0 +1,242 @@
name: qca-clean
channels:
- conda-forge
dependencies:
- anyio=4.10.0=pyhe01879c_0
- appnope=0.1.4=pyhd8ed1ab_1
- argon2-cffi=25.1.0=pyhd8ed1ab_0
- argon2-cffi-bindings=25.1.0=py311h3696347_0
- arrow=1.3.0=pyhd8ed1ab_1
- asttokens=3.0.0=pyhd8ed1ab_1
- async-lru=2.0.5=pyh29332c3_0
- attrs=25.3.0=pyh71513ae_0
- babel=2.17.0=pyhd8ed1ab_0
- beautifulsoup4=4.13.4=pyha770c72_0
- bleach=6.2.0=pyh29332c3_4
- bleach-with-css=6.2.0=h82add2a_4
- brotli=1.1.0=h5505292_3
- brotli-bin=1.1.0=h5505292_3
- brotli-python=1.1.0=py311h155a34a_3
- bzip2=1.0.8=h99b78c6_7
- c-ares=1.34.5=h5505292_0
- ca-certificates=2025.8.3=hbd8a1cb_0
- cached-property=1.5.2=hd8ed1ab_1
- cached_property=1.5.2=pyha770c72_1
- cairo=1.18.4=h6a3b0d2_0
- certifi=2025.8.3=pyhd8ed1ab_0
- cffi=1.17.1=py311h3a79f62_0
- chardet=5.2.0=pyhd8ed1ab_3
- charset-normalizer=3.4.3=pyhd8ed1ab_0
- comm=0.2.3=pyhe01879c_0
- contourpy=1.3.3=py311h57a9ea7_1
- cycler=0.12.1=pyhd8ed1ab_1
- cyrus-sasl=2.1.28=ha1cbb27_0
- debugpy=1.8.16=py311ha59bd64_0
- decorator=5.2.1=pyhd8ed1ab_0
- deepdiff=8.6.0=pyhe01879c_0
- defusedxml=0.7.1=pyhd8ed1ab_0
- exceptiongroup=1.3.0=pyhd8ed1ab_0
- executing=2.2.0=pyhd8ed1ab_0
- font-ttf-dejavu-sans-mono=2.37=hab24e00_0
- font-ttf-inconsolata=3.000=h77eed37_0
- font-ttf-source-code-pro=2.038=h77eed37_0
- font-ttf-ubuntu=0.83=h77eed37_3
- fontconfig=2.15.0=h1383a14_1
- fonts-conda-ecosystem=1=0
- fonts-conda-forge=1=0
- fonttools=4.59.1=py311h2fe624c_0
- fqdn=1.5.1=pyhd8ed1ab_1
- freetype=2.13.3=hce30654_1
- freetype-py=2.3.0=pyhd8ed1ab_0
- greenlet=3.2.4=py311hf719da1_0
- h11=0.16.0=pyhd8ed1ab_0
- h2=4.2.0=pyhd8ed1ab_0
- h5py=3.14.0=nompi_py311h8470beb_100
- hdf5=1.14.6=nompi_he65715a_103
- hpack=4.1.0=pyhd8ed1ab_0
- httpcore=1.0.9=pyh29332c3_0
- httpx=0.28.1=pyhd8ed1ab_0
- hyperframe=6.1.0=pyhd8ed1ab_0
- icu=75.1=hfee45f7_0
- idna=3.10=pyhd8ed1ab_1
- importlib-metadata=8.7.0=pyhe01879c_1
- ipykernel=6.30.1=pyh92f572d_0
- ipython=9.4.0=pyhfa0c392_0
- ipython_pygments_lexers=1.1.1=pyhd8ed1ab_0
- ipywidgets=8.1.7=pyhd8ed1ab_0
- isoduration=20.11.0=pyhd8ed1ab_1
- jedi=0.19.2=pyhd8ed1ab_1
- jinja2=3.1.6=pyhd8ed1ab_0
- json5=0.12.1=pyhd8ed1ab_0
- jsonpointer=3.0.0=py311h267d04e_1
- jsonschema=4.25.0=pyhe01879c_0
- jsonschema-specifications=2025.4.1=pyh29332c3_0
- jsonschema-with-format-nongpl=4.25.0=he01879c_0
- jupyter=1.1.1=pyhd8ed1ab_1
- jupyter-lsp=2.2.6=pyhe01879c_0
- jupyter_client=8.6.3=pyhd8ed1ab_1
- jupyter_console=6.6.3=pyhd8ed1ab_1
- jupyter_core=5.8.1=pyh31011fe_0
- jupyter_events=0.12.0=pyh29332c3_0
- jupyter_server=2.16.0=pyhe01879c_0
- jupyter_server_terminals=0.5.3=pyhd8ed1ab_1
- jupyterlab=4.4.6=pyhd8ed1ab_0
- jupyterlab_pygments=0.3.0=pyhd8ed1ab_2
- jupyterlab_server=2.27.3=pyhd8ed1ab_1
- jupyterlab_widgets=3.0.15=pyhd8ed1ab_0
- kiwisolver=1.4.9=py311h63e5c0c_0
- krb5=1.21.3=h237132a_0
- lark=1.2.2=pyhd8ed1ab_1
- lcms2=2.17=h7eeda09_0
- lerc=4.0.0=hd64df32_1
- libaec=1.1.4=h51d1e36_0
- libblas=3.9.0=34_h10e41b3_openblas
- libboost=1.86.0=hc9fb7c5_3
- libboost-python=1.86.0=py311h8fc16d6_3
- libbrotlicommon=1.1.0=h5505292_3
- libbrotlidec=1.1.0=h5505292_3
- libbrotlienc=1.1.0=h5505292_3
- libcblas=3.9.0=34_hb3479ef_openblas
- libcurl=8.14.1=h73640d1_0
- libcxx=20.1.8=hf598326_1
- libdeflate=1.24=h5773f1b_0
- libedit=3.1.20250104=pl5321hafb1f1b_0
- libev=4.33=h93a5062_2
- libexpat=2.7.1=hec049ff_0
- libffi=3.4.6=h1da3d7d_1
- libfreetype=2.13.3=hce30654_1
- libfreetype6=2.13.3=h1d14073_1
- libgfortran=15.1.0=hfdf1602_0
- libgfortran5=15.1.0=hb74de2c_0
- libglib=2.84.3=h587fa63_0
- libiconv=1.18=h23cfdf5_2
- libintl=0.25.1=h493aca8_0
- libjpeg-turbo=3.1.0=h5505292_0
- liblapack=3.9.0=34_hc9a63f6_openblas
- liblzma=5.8.1=h39f12f2_2
- libnghttp2=1.64.0=h6d7220d_0
- libntlm=1.8=h5505292_0
- libopenblas=0.3.30=openmp_h60d53f8_1
- libpng=1.6.50=h280e0eb_1
- libpq=17.6=h6846fd6_0
- librdkit=2025.03.5=hafd8b29_0
- libsodium=1.0.20=h99b78c6_0
- libsqlite=3.50.3=hf8de324_1
- libssh2=1.11.1=h1590b86_0
- libtiff=4.7.0=h025e3ab_6
- libwebp-base=1.6.0=h07db88b_0
- libxcb=1.17.0=hdb1d25a_0
- libxml2=2.13.8=h4a9ca0c_1
- libzlib=1.3.1=h8359307_2
- llvm-openmp=20.1.8=hbb9b287_1
- markupsafe=3.0.2=py311h4921393_1
- matplotlib-base=3.10.5=py311h66dac5a_0
- matplotlib-inline=0.1.7=pyhd8ed1ab_1
- mistune=3.1.3=pyh29332c3_0
- munkres=1.1.4=pyhd8ed1ab_1
- nbclient=0.10.2=pyhd8ed1ab_0
- nbconvert-core=7.16.6=pyh29332c3_0
- nbformat=5.10.4=pyhd8ed1ab_1
- ncurses=6.5=h5e97a16_3
- nest-asyncio=1.6.0=pyhd8ed1ab_1
- networkx=3.5=pyhe01879c_0
- notebook=7.4.5=pyhd8ed1ab_0
- notebook-shim=0.2.4=pyhd8ed1ab_1
- numpy=2.3.2=py311h0856f98_0
- openbabel=3.1.1=py311h292ccdb_9
- openjpeg=2.5.3=h889cd5d_1
- openldap=2.6.10=hbe55e7a_0
- openssl=3.5.2=he92f556_0
- orderly-set=5.5.0=pyhe01879c_0
- overrides=7.7.0=pyhd8ed1ab_1
- packaging=25.0=pyh29332c3_1
- pandas=2.3.1=py311hff7e5bb_0
- pandocfilters=1.5.0=pyhd8ed1ab_0
- parso=0.8.4=pyhd8ed1ab_1
- pcre2=10.45=ha881caa_0
- periodictable=1.7.1=pyhd8ed1ab_0
- pexpect=4.9.0=pyhd8ed1ab_1
- pickleshare=0.7.5=pyhd8ed1ab_1004
- pillow=11.3.0=py311hb9ba9e9_0
- pip=25.2=pyh8b19718_0
- pixman=0.46.4=h81086ad_1
- platformdirs=4.3.8=pyhe01879c_0
- prometheus_client=0.22.1=pyhd8ed1ab_0
- prompt-toolkit=3.0.51=pyha770c72_0
- prompt_toolkit=3.0.51=hd8ed1ab_0
- psutil=7.0.0=py311h917b07b_0
- pthread-stubs=0.4=hd74edd7_1002
- ptyprocess=0.7.0=pyhd8ed1ab_1
- pure_eval=0.2.3=pyhd8ed1ab_1
- py3dmol=2.5.2=pyhd8ed1ab_0
- pycairo=1.28.0=py311h8a0deb1_0
- pycparser=2.22=pyh29332c3_1
- pygments=2.19.2=pyhd8ed1ab_0
- pyobjc-core=11.1=py311hf0763de_0
- pyobjc-framework-cocoa=11.1=py311hab620ed_0
- pyparsing=3.2.3=pyhe01879c_2
- pysocks=1.7.1=pyha55dd90_7
- python=3.11.13=hc22306f_0_cpython
- python-dateutil=2.9.0.post0=pyhe01879c_2
- python-fastjsonschema=2.21.1=pyhd8ed1ab_0
- python-json-logger=2.0.7=pyhd8ed1ab_0
- python-tzdata=2025.2=pyhd8ed1ab_0
- python_abi=3.11=8_cp311
- pytz=2025.2=pyhd8ed1ab_0
- pyyaml=6.0.2=py311h4921393_2
- pyzmq=27.0.1=py311h2637eca_0
- qhull=2020.2=h420ef59_5
- rdkit=2025.03.5=py311h1da7121_0
- readline=8.2=h1d1bf99_2
- referencing=0.36.2=pyh29332c3_0
- reportlab=4.4.1=py311h917b07b_0
- requests=2.32.4=pyhd8ed1ab_0
- rfc3339-validator=0.1.4=pyhd8ed1ab_1
- rfc3986-validator=0.1.1=pyh9f0ad1d_0
- rfc3987-syntax=1.1.0=pyhe01879c_1
- rlpycairo=0.2.0=pyhd8ed1ab_0
- rpds-py=0.27.0=py311h1c3fc1a_0
- scipy=1.16.1=py311hffedffa_0
- send2trash=1.8.3=pyh31c8845_1
- setuptools=80.9.0=pyhff2d567_0
- six=1.17.0=pyhe01879c_1
- sniffio=1.3.1=pyhd8ed1ab_1
- soupsieve=2.7=pyhd8ed1ab_0
- sqlalchemy=2.0.43=py311h3696347_0
- stack_data=0.6.3=pyhd8ed1ab_1
- terminado=0.18.1=pyh31c8845_0
- tinycss2=1.4.0=pyhd8ed1ab_0
- tk=8.6.13=h892fb3f_2
- tomli=2.2.1=pyhe01879c_2
- tornado=6.5.2=py311h3696347_0
- traitlets=5.14.3=pyhd8ed1ab_1
- types-python-dateutil=2.9.0.20250809=pyhd8ed1ab_0
- typing-extensions=4.14.1=h4440ef1_0
- typing_extensions=4.14.1=pyhe01879c_0
- typing_utils=0.1.0=pyhd8ed1ab_1
- tzdata=2025b=h78e105d_0
- unicodedata2=16.0.0=py311h917b07b_0
- uri-template=1.3.0=pyhd8ed1ab_1
- urllib3=2.5.0=pyhd8ed1ab_0
- wcwidth=0.2.13=pyhd8ed1ab_1
- webcolors=24.11.1=pyhd8ed1ab_0
- webencodings=0.5.1=pyhd8ed1ab_3
- websocket-client=1.8.0=pyhd8ed1ab_1
- wheel=0.45.1=pyhd8ed1ab_1
- widgetsnbextension=4.0.14=pyhd8ed1ab_0
- xorg-libxau=1.0.12=h5505292_0
- xorg-libxdmcp=1.1.5=hd74edd7_0
- yaml=0.2.5=h925e9cb_3
- zeromq=4.3.5=hc1bb282_7
- zipp=3.23.0=pyhd8ed1ab_0
- zstandard=0.23.0=py311h917b07b_2
- zstd=1.5.7=h6491c7d_2
- pip:
  - MDAnalysis==2.10.0.dev0
  - qcarchivetesting==0.62.post11+g5735f6503
  - qcfractal==0.62.post11+g5735f6503
  - qcfractalcompute==0.62.post11+g5735f6503
  - qcportal==0.62.post11+g5735f6503
  - tmos==1.0.0+33.g2b4f7f8

prefix: "/Users/jenniferclark/mamba/envs/qca-clean"