Iodine frag opt #373

Merged: 14 commits, Sep 12, 2024
1 change: 1 addition & 0 deletions README.md
@@ -252,6 +252,7 @@ These are currently used to find a minimum energy conformation of a molecule.
| `OpenFF Torsion Benchmark Supplement Optimization Dataset v1.0` | [2024-04-18-OpenFF-Torsion-Benchmark-Supplement-Optimization-Dataset-v1.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2024-04-18-OpenFF-Torsion-Benchmark-Supplement-Optimization-Dataset-v1.0) | Additional optimizations for benchmarking Sage 2.2.0 proper torsions and new parameters from the torsion multiplicity work | H, C, N, O, F, P, S, Cl, Br | |
| `OpenFF Torsion Multiplicity Optimization Training Coverage Supplement v1.0` | [2024-06-20-OpenFF-Torsion-Multiplicity-Optimization-Training-Coverage-Supplement-v1.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2024-06-20-OpenFF-Torsion-Multiplicity-Optimization-Training-Coverage-Supplement-v1.0) | Additional optimization training data for Sage 2.2.0 proper torsions and new parameters from the torsion multiplicity work | C, Cl, S, O, H, P, N, Br | |
| `OpenFF Torsion Multiplicity Optimization Benchmarking Coverage Supplement v1.0` | [2024-06-24-OpenFF-Torsion-Multiplicity-Optimization-Benchmarking-Coverage-Supplement-v1.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2024-06-24-OpenFF-Torsion-Multiplicity-Optimization-Benchmarking-Coverage-Supplement-v1.0) | Additional optimization benchmarking data for Sage 2.2.0 proper torsions and new parameters from the torsion multiplicity work | Cl, H, I, S, O, N, Br, C, P | |
| `OpenFF Iodine Fragment Opt v1.0` | [2024-09-10-OpenFF-Iodine-Fragment-Opt-v1.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2024-09-10-OpenFF-Iodine-Fragment-Opt-v1.0) | B3LYP-D3BJ/DZVP optimized conformers for a variety of I-containing fragment molecules | C, O, I, S, F, Br, Cl, N, H | |


# TorsionDrive Datasets
@@ -0,0 +1,60 @@
# OpenFF Iodine Fragment Opt v1.0

## Description

This dataset contains fragments of molecules from the ZINC, Enamine 10240, Enamine 50240, and ChEMBL datasets, optimized at the B3LYP-D3BJ/DZVP level of theory. Molecules containing iodine were split into small fragments, and protomers were then enumerated for each iodine-containing fragment.

For each resulting molecule, a set of up to 5 conformers was generated by:
* generating a set of up to 1000 conformers with an RMS cutoff of 0.5 Å using the OpenEye backend of the OpenFF toolkit
* applying ELF conformer selection (max 5 conformers) using OpenEye
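
The two steps above can be sketched with the OpenFF toolkit as follows. This is a minimal illustration under stated assumptions, not the code from `generate-dataset.ipynb`: the function name, the constant names, and the `allow_undefined_stereo=True` choice are hypothetical, and running it requires the OpenFF toolkit plus a licensed OpenEye installation.

```python
# Settings taken from the description above.
MAX_INITIAL_CONFORMERS = 1000
RMS_CUTOFF_ANGSTROM = 0.5
MAX_ELF_CONFORMERS = 5


def generate_elf_conformers(smiles):
    """Return an OpenFF Molecule holding up to MAX_ELF_CONFORMERS conformers.

    Hypothetical helper: it mirrors the two bullet points above, but the
    notebook in the submission may differ in detail.
    """
    # Imported inside the function so the settings above can be read
    # without the toolkit (or an OpenEye license) being available.
    from openff.toolkit import Molecule
    from openff.toolkit.utils.toolkits import OpenEyeToolkitWrapper
    from openff.units import unit

    openeye = OpenEyeToolkitWrapper()

    # Assumption: undefined stereochemistry is tolerated when parsing.
    molecule = Molecule.from_smiles(smiles, allow_undefined_stereo=True)

    # Step 1: generate up to 1000 conformers, pruned with a 0.5 Å RMS cutoff.
    molecule.generate_conformers(
        n_conformers=MAX_INITIAL_CONFORMERS,
        rms_cutoff=RMS_CUTOFF_ANGSTROM * unit.angstrom,
        toolkit_registry=openeye,
    )

    # Step 2: ELF conformer selection, keeping at most 5 conformers.
    molecule.apply_elf_conformer_selection(
        limit=MAX_ELF_CONFORMERS,
        toolkit_registry=openeye,
    )
    return molecule
```

Calling `generate_elf_conformers` on one SMILES string from `enumerated.smi` should return a molecule whose `n_conformers` is between 1 and 5, assuming OpenEye can embed it.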


## General information

* Date: 2024-09-10
* Class: OpenFF OptimizationDataset
* Purpose: Geometry optimization for eventual ESP calculation
* Name: OpenFF Iodine Fragment Opt v1.0
* Number of unique molecules: 526
* Number of conformers: 531
* Number of conformers per molecule (min, mean, max): (1, 1.01, 5)
* Molecular weight (min, mean, max): (155.97, 242.52, 316.94)
* Charges: [-1.0, 0.0, 1.0, 2.0]
* Dataset submitter: Alexandra McIsaac
* Dataset generator: Alexandra McIsaac
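
The conformer statistics above are mutually consistent: 531 conformers over 526 molecules gives a mean of about 1.01. The sketch below shows one way such a summary could be computed from per-molecule conformer counts. The function name is hypothetical, and the example distribution (524 molecules with 1 conformer, one with 2, one with 5) is invented purely to reproduce the reported totals; the real per-molecule counts are not given here.

```python
from statistics import mean


def conformer_stats(conformers_per_molecule):
    """Summarize a list of per-molecule conformer counts."""
    counts = list(conformers_per_molecule)
    return {
        "n_molecules": len(counts),
        "n_conformers": sum(counts),
        "min": min(counts),
        "mean": round(mean(counts), 2),
        "max": max(counts),
    }


# Illustrative counts only (an assumption, not the real distribution):
# 524 * 1 + 2 + 5 = 531 conformers across 526 molecules.
example_counts = [1] * 524 + [2, 5]
example_stats = conformer_stats(example_counts)
```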


## QCSubmit generation pipeline

* `generate-dataset.ipynb`: This notebook shows how the dataset was prepared from the input files.
* Conformers were generated and selected using OpenEye


## QCSubmit Manifest

* `dataset.json.bz2`: Compressed dataset ready for submission
* `dataset.pdf`: Visualization of dataset molecules
* `dataset.smi`: SMILES strings for dataset molecules
* `generate-dataset.ipynb`: Notebook describing dataset generation and submission
* `enumerated.smi`: Input SMILES
* `input-environment.yaml`: Environment file used to create Python environment for the notebook
* `full-environment.yaml`: Fully-resolved environment used to execute the notebook.
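
For a quick look at the manifest files without installing QCSubmit, the compressed dataset and the SMILES file can be read with the Python standard library alone. This is a hedged sketch: the helper names are hypothetical, no assumption is made about the keys inside the JSON, and the `.smi` parser assumes the common layout of one SMILES string per line, optionally followed by whitespace and a name. QCSubmit's own dataset classes remain the intended way to load the dataset for submission.

```python
import bz2
import json
from pathlib import Path


def load_compressed_json(path):
    """Decompress a bzip2-compressed JSON file and parse it."""
    with bz2.open(Path(path), mode="rt", encoding="utf-8") as handle:
        return json.load(handle)


def read_smiles(path):
    """Return the first whitespace-separated token of each non-empty line."""
    smiles = []
    with open(Path(path), encoding="utf-8") as handle:
        for line in handle:
            tokens = line.split()
            if tokens:
                smiles.append(tokens[0])
    return smiles
```

For example, `load_compressed_json("dataset.json.bz2")` returns the parsed JSON content, and `len(read_smiles("dataset.smi"))` gives the number of SMILES entries in the file.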


## Metadata

* Elements: {C, I, Br, Cl, S, N, H, F, O}
* unique molecules: 526
* Spec: default
    * basis: DZVP
    * implicit_solvent: None
    * keywords: {}
    * maxiter: 200
    * method: B3LYP-D3BJ
    * program: psi4
    * SCF properties:
        * dipole
        * quadrupole
        * wiberg_lowdin_indices
        * mayer_indices