1 change: 1 addition & 0 deletions README.md
@@ -263,6 +263,7 @@ These are currently used to compute properties of a minimum energy conformation
| `OpenFF Optimization Hessians 2019-07 to 2025-03 v4.0` | [2025-04-14-OpenFF-Optimization-Hessians-2019-07-to-2025-03-v4.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2025-04-14-OpenFF-Optimization-Hessians-2019-07-to-2025-03-v4.0) | Hessian single points for the final molecules in OpenFF optimization datasets from 2019-07 to 2025-03 | S, H, O, Br, F, N, P, Cl, I, C ||
| `OpenFF CX3-CX4 singlepoints v4.0"` | [2025-05-21-OpenFF-CX3-CX4-singlepoints-v4.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2025-05-21-OpenFF-CX3-CX4-singlepoints-v4.0) | Single-points of molecules where Sage 2.2.1 torsions t17 and t18 have been driven | Br, C, Cl, F, H, I, N, O, S ||
|`MLPepper RECAP Optimized Fragments v1.1`| [2025-07-01-MLPepper-RECAP-Optimized-Fragments-v1.1](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2025-07-01-MLPepper-RECAP-Optimized-Fragments-v1.1) | Single point property calculations for charge models, expanded to include iodine | P ,B ,Cl ,Br ,C ,H ,I ,F ,O ,N ,Si ,S | |
| `Curated tmQM-xtb Dataset: T=100K Dataset Restricted to Pd, Zn, Fe, Cu v0.0` | [2025-08-14-tmQM-xtb-Dataset-T=100K-Pd-Zn-Fe-Cu-low-mw-v0.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2025-08-14-tmQM-xtb-Dataset-T=100K-Pd-Zn-Fe-Cu-low-mw-v0.0) | BP86/def2-TZVP Conformers for single metal complexes with Pd, Fe, Zn, Cu and charge of {-1,0,+1}, MW <= 600 Da, generally high coordinate, and 30 geometry samples | Br, C, Cl, Cu, F, Fe, H, N, O, P, Pd, S, Zn ||



@@ -0,0 +1,61 @@
# tmQM xtb Dataset T=100K Pd Zn Fe Cu low mw v0.0

### Description

This dataset was generated starting from the tmQM dataset (release 13Aug2024, https://github.com/uiocompcat/tmQM).
It contains 6,829 unique systems with 471,097 total configurations/spin states, with molecular weights up to 600 Da. The
molecules are limited to those containing the transition metals Pd, Zn, Fe, or Cu, and otherwise contain only the elements
C, H, P, S, O, N, F, Cl, or Br, with charges {-1,0,+1}. The metal is restricted to more than four coordination sites,
except for Cu and Zn, which must have at least two. Each molecule was preprocessed using gfn2-xtb, and then a short MD
simulation was performed to provide ~30 additional configurations per molecule at three different spin states: 1, 3, and 5.
This singlepoint dataset was then run with BP86/def2-TZVP on the geometries from molecular dynamics using
gfn-xtb. Each configuration is reported with the following properties: 'energy', 'gradient', 'dipole', 'quadrupole',
'wiberg_lowdin_indices', 'mayer_indices', 'lowdin_charges', 'dipole_polarizabilities', 'mulliken_charges'. SMILES
strings were generated with tmos (https://github.com/openforcefield/tmos) when possible. These SMILES strings can be
imported into RDKit for initial visualization, but will not reflect the coordinate geometries presented in tmQM.
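
The selection rules in the description can be written down as a small predicate. This is only an illustrative sketch: the `Complex` record and the `passes_filters` function are hypothetical names, not part of tmQM, tmos, or the notebook in this submission, and the "single metal" requirement is taken from the dataset's purpose statement.

```python
from dataclasses import dataclass
from typing import List

# Values taken from the dataset description.
ALLOWED_METALS = {"Pd", "Zn", "Fe", "Cu"}
ALLOWED_NONMETALS = {"C", "H", "P", "S", "O", "N", "F", "Cl", "Br"}
ALLOWED_CHARGES = {-1, 0, 1}
MAX_MW_DA = 600.0


@dataclass
class Complex:
    """Hypothetical record; not the tmQM or QCArchive schema."""
    elements: List[str]   # one element symbol per atom
    charge: int           # total molecular charge
    mol_weight: float     # molecular weight in Da
    coordination: int     # number of coordination sites on the metal


def passes_filters(c: Complex) -> bool:
    """Return True if the complex satisfies the stated selection rules."""
    metals = [e for e in c.elements if e in ALLOWED_METALS]
    others = {e for e in c.elements if e not in ALLOWED_METALS}
    if len(metals) != 1:  # single-metal complexes only
        return False
    if not others <= ALLOWED_NONMETALS:
        return False
    if c.charge not in ALLOWED_CHARGES or c.mol_weight > MAX_MW_DA:
        return False
    # Cu and Zn need at least two coordination sites; Pd and Fe need more than four.
    min_sites = 2 if metals[0] in {"Cu", "Zn"} else 5
    return c.coordination >= min_sites
```

For example, `passes_filters(Complex(["Cu", "C", "N"], 1, 200.0, 2))` accepts a two-coordinate Cu complex, while the same coordination number on Pd or Fe would be rejected.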

### General Information

- Date: 2025-08-14
- Purpose: BP86/def2-TZVP Conformers for single metal complexes with Pd, Fe, Zn, Cu, and charge of {-1,0,+1}, MW <= 600 Da, generally high coordinate, and 30 geometry samples
- Dataset Type: singlepoint
- Name: tmQM xtb Dataset T=100K Pd Zn Fe Cu low mw v0.0
- Number of unique molecules: 6,829
- Number of filtered molecules: 0
- Number of Conformers: 471,097
- Number of conformers (min mean max): 30, 68, 88
- Molecular Weight (min mean max): 95, 455, 600
- Set of charges: -1.0, 0.0, 1.0
- Dataset Submitter: Jennifer A. Clark
- Dataset Curator: Christopher R. Iacovella
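
The counts above can be cross-checked with a short calculation. The sketch assumes that "~30 configurations at each of three spin states" implies a nominal ceiling of about 90 conformers per molecule; that ceiling is an inference from the description, not a number stated in this README.

```python
# Totals copied from the General Information list.
n_molecules = 6_829
n_conformers = 471_097

# Average number of conformers per unique molecule.
mean_per_molecule = n_conformers / n_molecules

# Assumption: ~30 MD configurations at each of 3 spin states.
nominal_max = 30 * 3

print(f"mean conformers per molecule: {mean_per_molecule:.2f}")
print(f"nominal maximum per molecule: {nominal_max}")
```

The computed mean falls just under 69, so the listed mean of 68 is consistent with a truncated rather than rounded value (an assumption about how the summary was produced). The listed maximum of 88 sits below the nominal ceiling.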

### QCSubmit generation pipeline

- `generate-dataset.ipynb`: A Python notebook that shows how the dataset was prepared from the input files.

### QCSubmit Manifest

- `generate-dataset.ipynb`
- `environment.yml`: Conda environment file to perform this workflow
- `environment_full.yml`: All installed packages with versions for successful completion of this workflow
- `scaffold.json.bz2`: A compressed json file of the target dataset
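
A bz2-compressed JSON file such as `scaffold.json.bz2` can be inspected with the Python standard library alone. This is a minimal sketch; `load_scaffold` and `summarize` are hypothetical helper names, and nothing is assumed about the internal layout of the scaffold beyond it being valid JSON.

```python
import bz2
import json
from pathlib import Path


def load_scaffold(path):
    """Decompress a bz2-compressed JSON file and return the parsed object."""
    with bz2.open(Path(path), mode="rt", encoding="utf-8") as handle:
        return json.load(handle)


def summarize(obj):
    """Return the sorted top-level keys of a dict, or the length of a list."""
    if isinstance(obj, dict):
        return sorted(obj.keys())
    if isinstance(obj, list):
        return len(obj)
    return type(obj).__name__
```

Calling `summarize(load_scaffold("scaffold.json.bz2"))` gives a quick look at the top-level structure before loading the file into heavier tooling.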

### Metadata

* Elements: {'Br', 'C', 'Cl', 'Cu', 'F', 'Fe', 'H', 'N', 'O', 'P', 'Pd', 'S', 'Zn'}
* QC Specifications: BP86/def2-TZVP
  * program: psi4
  * method: BP86
  * basis: def2-TZVP
  * driver: gradient
  * implicit_solvent: None
  * keywords: {}
  * maxiter: 500
  * SCF Properties:
    * dipole
    * quadrupole
    * wiberg_lowdin_indices
    * mayer_indices
    * lowdin_charges
    * dipole_polarizabilities
    * mulliken_charges
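
The specification above can be mirrored as a plain dictionary and checked against the properties that the description says are reported. This is a sketch only: `QC_SPEC`, `REPORTED`, and `missing_properties` are hypothetical names, the dictionary is not a QCPortal or QCSubmit object, and the check assumes that a gradient calculation also returns the energy.

```python
# Plain-Python mirror of the Metadata block; not a QCPortal/QCSubmit object.
QC_SPEC = {
    "program": "psi4",
    "method": "BP86",
    "basis": "def2-TZVP",
    "driver": "gradient",
    "implicit_solvent": None,
    "keywords": {},
    "maxiter": 500,
    "scf_properties": [
        "dipole",
        "quadrupole",
        "wiberg_lowdin_indices",
        "mayer_indices",
        "lowdin_charges",
        "dipole_polarizabilities",
        "mulliken_charges",
    ],
}

# Properties the Description lists for every configuration.
REPORTED = [
    "energy", "gradient", "dipole", "quadrupole",
    "wiberg_lowdin_indices", "mayer_indices", "lowdin_charges",
    "dipole_polarizabilities", "mulliken_charges",
]


def missing_properties(spec, reported):
    """Return reported properties not covered by the driver or SCF property list."""
    covered = set(spec["scf_properties"])
    # Assumption: a gradient driver yields both the gradient and the energy.
    if spec["driver"] == "gradient":
        covered |= {"energy", "gradient"}
    else:
        covered.add(spec["driver"])
    return sorted(set(reported) - covered)
```

An empty result from `missing_properties(QC_SPEC, REPORTED)` indicates that the description and the metadata agree on what is computed.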
@@ -0,0 +1,22 @@
name: qca-clean
channels:
- conda-forge
dependencies:
- python=3.11
- numpy
- jupyter
- pandas
- h5py
- periodictable
- qcportal>=0.61
- qcfractal>=0.61
- qcfractalcompute>=0.59
- rdkit>=2025.3.3
- openbabel
- deepdiff
- py3Dmol
- scipy
- networkx
- pip:
    - git+https://github.com/openforcefield/tmos.git
    - git+https://github.com/MDAnalysis/mdanalysis.git@develop#subdirectory=package # delete after 2.10 is released
@@ -0,0 +1,242 @@
name: qca-clean
channels:
- conda-forge
dependencies:
- anyio=4.10.0=pyhe01879c_0
- appnope=0.1.4=pyhd8ed1ab_1
- argon2-cffi=25.1.0=pyhd8ed1ab_0
- argon2-cffi-bindings=25.1.0=py311h3696347_0
- arrow=1.3.0=pyhd8ed1ab_1
- asttokens=3.0.0=pyhd8ed1ab_1
- async-lru=2.0.5=pyh29332c3_0
- attrs=25.3.0=pyh71513ae_0
- babel=2.17.0=pyhd8ed1ab_0
- beautifulsoup4=4.13.4=pyha770c72_0
- bleach=6.2.0=pyh29332c3_4
- bleach-with-css=6.2.0=h82add2a_4
- brotli=1.1.0=h5505292_3
- brotli-bin=1.1.0=h5505292_3
- brotli-python=1.1.0=py311h155a34a_3
- bzip2=1.0.8=h99b78c6_7
- c-ares=1.34.5=h5505292_0
- ca-certificates=2025.8.3=hbd8a1cb_0
- cached-property=1.5.2=hd8ed1ab_1
- cached_property=1.5.2=pyha770c72_1
- cairo=1.18.4=h6a3b0d2_0
- certifi=2025.8.3=pyhd8ed1ab_0
- cffi=1.17.1=py311h3a79f62_0
- chardet=5.2.0=pyhd8ed1ab_3
- charset-normalizer=3.4.3=pyhd8ed1ab_0
- comm=0.2.3=pyhe01879c_0
- contourpy=1.3.3=py311h57a9ea7_1
- cycler=0.12.1=pyhd8ed1ab_1
- cyrus-sasl=2.1.28=ha1cbb27_0
- debugpy=1.8.16=py311ha59bd64_0
- decorator=5.2.1=pyhd8ed1ab_0
- deepdiff=8.6.0=pyhe01879c_0
- defusedxml=0.7.1=pyhd8ed1ab_0
- exceptiongroup=1.3.0=pyhd8ed1ab_0
- executing=2.2.0=pyhd8ed1ab_0
- font-ttf-dejavu-sans-mono=2.37=hab24e00_0
- font-ttf-inconsolata=3.000=h77eed37_0
- font-ttf-source-code-pro=2.038=h77eed37_0
- font-ttf-ubuntu=0.83=h77eed37_3
- fontconfig=2.15.0=h1383a14_1
- fonts-conda-ecosystem=1=0
- fonts-conda-forge=1=0
- fonttools=4.59.1=py311h2fe624c_0
- fqdn=1.5.1=pyhd8ed1ab_1
- freetype=2.13.3=hce30654_1
- freetype-py=2.3.0=pyhd8ed1ab_0
- greenlet=3.2.4=py311hf719da1_0
- h11=0.16.0=pyhd8ed1ab_0
- h2=4.2.0=pyhd8ed1ab_0
- h5py=3.14.0=nompi_py311h8470beb_100
- hdf5=1.14.6=nompi_he65715a_103
- hpack=4.1.0=pyhd8ed1ab_0
- httpcore=1.0.9=pyh29332c3_0
- httpx=0.28.1=pyhd8ed1ab_0
- hyperframe=6.1.0=pyhd8ed1ab_0
- icu=75.1=hfee45f7_0
- idna=3.10=pyhd8ed1ab_1
- importlib-metadata=8.7.0=pyhe01879c_1
- ipykernel=6.30.1=pyh92f572d_0
- ipython=9.4.0=pyhfa0c392_0
- ipython_pygments_lexers=1.1.1=pyhd8ed1ab_0
- ipywidgets=8.1.7=pyhd8ed1ab_0
- isoduration=20.11.0=pyhd8ed1ab_1
- jedi=0.19.2=pyhd8ed1ab_1
- jinja2=3.1.6=pyhd8ed1ab_0
- json5=0.12.1=pyhd8ed1ab_0
- jsonpointer=3.0.0=py311h267d04e_1
- jsonschema=4.25.0=pyhe01879c_0
- jsonschema-specifications=2025.4.1=pyh29332c3_0
- jsonschema-with-format-nongpl=4.25.0=he01879c_0
- jupyter=1.1.1=pyhd8ed1ab_1
- jupyter-lsp=2.2.6=pyhe01879c_0
- jupyter_client=8.6.3=pyhd8ed1ab_1
- jupyter_console=6.6.3=pyhd8ed1ab_1
- jupyter_core=5.8.1=pyh31011fe_0
- jupyter_events=0.12.0=pyh29332c3_0
- jupyter_server=2.16.0=pyhe01879c_0
- jupyter_server_terminals=0.5.3=pyhd8ed1ab_1
- jupyterlab=4.4.6=pyhd8ed1ab_0
- jupyterlab_pygments=0.3.0=pyhd8ed1ab_2
- jupyterlab_server=2.27.3=pyhd8ed1ab_1
- jupyterlab_widgets=3.0.15=pyhd8ed1ab_0
- kiwisolver=1.4.9=py311h63e5c0c_0
- krb5=1.21.3=h237132a_0
- lark=1.2.2=pyhd8ed1ab_1
- lcms2=2.17=h7eeda09_0
- lerc=4.0.0=hd64df32_1
- libaec=1.1.4=h51d1e36_0
- libblas=3.9.0=34_h10e41b3_openblas
- libboost=1.86.0=hc9fb7c5_3
- libboost-python=1.86.0=py311h8fc16d6_3
- libbrotlicommon=1.1.0=h5505292_3
- libbrotlidec=1.1.0=h5505292_3
- libbrotlienc=1.1.0=h5505292_3
- libcblas=3.9.0=34_hb3479ef_openblas
- libcurl=8.14.1=h73640d1_0
- libcxx=20.1.8=hf598326_1
- libdeflate=1.24=h5773f1b_0
- libedit=3.1.20250104=pl5321hafb1f1b_0
- libev=4.33=h93a5062_2
- libexpat=2.7.1=hec049ff_0
- libffi=3.4.6=h1da3d7d_1
- libfreetype=2.13.3=hce30654_1
- libfreetype6=2.13.3=h1d14073_1
- libgfortran=15.1.0=hfdf1602_0
- libgfortran5=15.1.0=hb74de2c_0
- libglib=2.84.3=h587fa63_0
- libiconv=1.18=h23cfdf5_2
- libintl=0.25.1=h493aca8_0
- libjpeg-turbo=3.1.0=h5505292_0
- liblapack=3.9.0=34_hc9a63f6_openblas
- liblzma=5.8.1=h39f12f2_2
- libnghttp2=1.64.0=h6d7220d_0
- libntlm=1.8=h5505292_0
- libopenblas=0.3.30=openmp_h60d53f8_1
- libpng=1.6.50=h280e0eb_1
- libpq=17.6=h6846fd6_0
- librdkit=2025.03.5=hafd8b29_0
- libsodium=1.0.20=h99b78c6_0
- libsqlite=3.50.3=hf8de324_1
- libssh2=1.11.1=h1590b86_0
- libtiff=4.7.0=h025e3ab_6
- libwebp-base=1.6.0=h07db88b_0
- libxcb=1.17.0=hdb1d25a_0
- libxml2=2.13.8=h4a9ca0c_1
- libzlib=1.3.1=h8359307_2
- llvm-openmp=20.1.8=hbb9b287_1
- markupsafe=3.0.2=py311h4921393_1
- matplotlib-base=3.10.5=py311h66dac5a_0
- matplotlib-inline=0.1.7=pyhd8ed1ab_1
- mistune=3.1.3=pyh29332c3_0
- munkres=1.1.4=pyhd8ed1ab_1
- nbclient=0.10.2=pyhd8ed1ab_0
- nbconvert-core=7.16.6=pyh29332c3_0
- nbformat=5.10.4=pyhd8ed1ab_1
- ncurses=6.5=h5e97a16_3
- nest-asyncio=1.6.0=pyhd8ed1ab_1
- networkx=3.5=pyhe01879c_0
- notebook=7.4.5=pyhd8ed1ab_0
- notebook-shim=0.2.4=pyhd8ed1ab_1
- numpy=2.3.2=py311h0856f98_0
- openbabel=3.1.1=py311h292ccdb_9
- openjpeg=2.5.3=h889cd5d_1
- openldap=2.6.10=hbe55e7a_0
- openssl=3.5.2=he92f556_0
- orderly-set=5.5.0=pyhe01879c_0
- overrides=7.7.0=pyhd8ed1ab_1
- packaging=25.0=pyh29332c3_1
- pandas=2.3.1=py311hff7e5bb_0
- pandocfilters=1.5.0=pyhd8ed1ab_0
- parso=0.8.4=pyhd8ed1ab_1
- pcre2=10.45=ha881caa_0
- periodictable=1.7.1=pyhd8ed1ab_0
- pexpect=4.9.0=pyhd8ed1ab_1
- pickleshare=0.7.5=pyhd8ed1ab_1004
- pillow=11.3.0=py311hb9ba9e9_0
- pip=25.2=pyh8b19718_0
- pixman=0.46.4=h81086ad_1
- platformdirs=4.3.8=pyhe01879c_0
- prometheus_client=0.22.1=pyhd8ed1ab_0
- prompt-toolkit=3.0.51=pyha770c72_0
- prompt_toolkit=3.0.51=hd8ed1ab_0
- psutil=7.0.0=py311h917b07b_0
- pthread-stubs=0.4=hd74edd7_1002
- ptyprocess=0.7.0=pyhd8ed1ab_1
- pure_eval=0.2.3=pyhd8ed1ab_1
- py3dmol=2.5.2=pyhd8ed1ab_0
- pycairo=1.28.0=py311h8a0deb1_0
- pycparser=2.22=pyh29332c3_1
- pygments=2.19.2=pyhd8ed1ab_0
- pyobjc-core=11.1=py311hf0763de_0
- pyobjc-framework-cocoa=11.1=py311hab620ed_0
- pyparsing=3.2.3=pyhe01879c_2
- pysocks=1.7.1=pyha55dd90_7
- python=3.11.13=hc22306f_0_cpython
- python-dateutil=2.9.0.post0=pyhe01879c_2
- python-fastjsonschema=2.21.1=pyhd8ed1ab_0
- python-json-logger=2.0.7=pyhd8ed1ab_0
- python-tzdata=2025.2=pyhd8ed1ab_0
- python_abi=3.11=8_cp311
- pytz=2025.2=pyhd8ed1ab_0
- pyyaml=6.0.2=py311h4921393_2
- pyzmq=27.0.1=py311h2637eca_0
- qhull=2020.2=h420ef59_5
- rdkit=2025.03.5=py311h1da7121_0
- readline=8.2=h1d1bf99_2
- referencing=0.36.2=pyh29332c3_0
- reportlab=4.4.1=py311h917b07b_0
- requests=2.32.4=pyhd8ed1ab_0
- rfc3339-validator=0.1.4=pyhd8ed1ab_1
- rfc3986-validator=0.1.1=pyh9f0ad1d_0
- rfc3987-syntax=1.1.0=pyhe01879c_1
- rlpycairo=0.2.0=pyhd8ed1ab_0
- rpds-py=0.27.0=py311h1c3fc1a_0
- scipy=1.16.1=py311hffedffa_0
- send2trash=1.8.3=pyh31c8845_1
- setuptools=80.9.0=pyhff2d567_0
- six=1.17.0=pyhe01879c_1
- sniffio=1.3.1=pyhd8ed1ab_1
- soupsieve=2.7=pyhd8ed1ab_0
- sqlalchemy=2.0.43=py311h3696347_0
- stack_data=0.6.3=pyhd8ed1ab_1
- terminado=0.18.1=pyh31c8845_0
- tinycss2=1.4.0=pyhd8ed1ab_0
- tk=8.6.13=h892fb3f_2
- tomli=2.2.1=pyhe01879c_2
- tornado=6.5.2=py311h3696347_0
- traitlets=5.14.3=pyhd8ed1ab_1
- types-python-dateutil=2.9.0.20250809=pyhd8ed1ab_0
- typing-extensions=4.14.1=h4440ef1_0
- typing_extensions=4.14.1=pyhe01879c_0
- typing_utils=0.1.0=pyhd8ed1ab_1
- tzdata=2025b=h78e105d_0
- unicodedata2=16.0.0=py311h917b07b_0
- uri-template=1.3.0=pyhd8ed1ab_1
- urllib3=2.5.0=pyhd8ed1ab_0
- wcwidth=0.2.13=pyhd8ed1ab_1
- webcolors=24.11.1=pyhd8ed1ab_0
- webencodings=0.5.1=pyhd8ed1ab_3
- websocket-client=1.8.0=pyhd8ed1ab_1
- wheel=0.45.1=pyhd8ed1ab_1
- widgetsnbextension=4.0.14=pyhd8ed1ab_0
- xorg-libxau=1.0.12=h5505292_0
- xorg-libxdmcp=1.1.5=hd74edd7_0
- yaml=0.2.5=h925e9cb_3
- zeromq=4.3.5=hc1bb282_7
- zipp=3.23.0=pyhd8ed1ab_0
- zstandard=0.23.0=py311h917b07b_2
- zstd=1.5.7=h6491c7d_2
- pip:
    - MDAnalysis==2.10.0.dev0
    - qcarchivetesting==0.62.post11+g5735f6503
    - qcfractal==0.62.post11+g5735f6503
    - qcfractalcompute==0.62.post11+g5735f6503
    - qcportal==0.62.post11+g5735f6503
    - tmos==1.0.0+33.g2b4f7f8

prefix: "/Users/jenniferclark/mamba/envs/qca-clean"