Description
Summary
MEEKO v0.6.1 generates an invalid PDBQT file for one specific bridged bicyclic compound (LIBD-002). The file contains the pseudo-atom types "CG0" and "G0", which AutoDock/QuickVina do not recognize.
Environment
- MEEKO version: 0.6.1
- RDKit version: (from conda drd_tools env)
- Python version: 3.11
- Platform: Linux (SDSC Expanse HPC)
Problem Description
When converting LIBD-002 from SMILES to PDBQT format, MEEKO successfully generates a file but includes invalid atom types "CG0" and "G0". These pseudo-atoms appear to be internal representations for ring closure constraints that were not converted back to standard AutoDock atom types before export.
Impact: QuickVina and AutoDock Vina reject the PDBQT file during docking.
Affected Molecule
Compound: LIBD-002
SMILES: CN1CCc2c(c(c(cc2C(C1)c1cccc(c1)C)O)O)Cl
Structure Type: Bridged bicyclic system with N-alkyl bridge
Ring System: 7-membered ring fused with a 6-membered aromatic ring
This structure is chemically valid and passes all RDKit validation checks. The issue appears specific to MEEKO's handling of bridged bicyclic topology.
Minimal Reproducible Example
from rdkit import Chem
from rdkit.Chem import AllChem
from meeko import MoleculePreparation
# LIBD-002
smiles = "CN1CCc2c(c(c(cc2C(C1)c1cccc(c1)C)O)O)Cl"
# Generate 3D structure
mol = Chem.MolFromSmiles(smiles)
mol = Chem.AddHs(mol)
AllChem.EmbedMolecule(mol, randomSeed=42)
AllChem.UFFOptimizeMolecule(mol)
# Convert to PDBQT with MEEKO
preparator = MoleculePreparation()
preparator.prepare(mol)
pdbqt_string = preparator.write_pdbqt_string()
# Check for invalid atom types
if 'CG0' in pdbqt_string or ' G0' in pdbqt_string:
    print("❌ Invalid pseudo-atoms found in PDBQT output")
else:
    print("✓ Valid PDBQT generated")
Expected: Valid PDBQT with only standard AutoDock atom types
Actual: PDBQT contains "CG0" and "G0" pseudo-atoms
PDBQT Output (Problematic Section)
Line 15: ATOM 5 C UNL 1 -2.349 0.911 -2.117 1.00 0.00 0.089 CG0
Line 16: ATOM 6 G UNL 1 -3.095 0.015 -1.122 1.00 0.00 0.000 G0
Line 29: ATOM 14 C UNL 1 -3.095 0.015 -1.122 1.00 0.00 0.053 CG0
Line 30: ATOM 15 G UNL 1 -2.349 0.911 -2.117 1.00 0.00 0.000 G0
Valid AutoDock atom types: H, HD, HS, C, A, N, NA, NS, OA, OS, F, Mg, P, SA, S, Cl, Ca, Mn, Fe, Zn, Br, I
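A stricter check than the substring test in the reproducible example is to compare the last field of every ATOM/HETATM record against the list above. The sketch below is one way to do that. The function name `find_invalid_atom_types` is hypothetical, and the set of valid types is copied from this report rather than from an authoritative AutoDock table.

```python
# Atom types listed in this report as valid. This is an assumption about what
# the target docking engine accepts, not an authoritative AutoDock table.
VALID_TYPES = {
    "H", "HD", "HS", "C", "A", "N", "NA", "NS", "OA", "OS", "F", "Mg",
    "P", "SA", "S", "Cl", "Ca", "Mn", "Fe", "Zn", "Br", "I",
}


def find_invalid_atom_types(pdbqt_string, valid_types=VALID_TYPES):
    """Return (line_number, atom_type) for each atom record with an unlisted type."""
    problems = []
    for line_number, line in enumerate(pdbqt_string.splitlines(), start=1):
        # Only atom records carry an atom type; skip ROOT, BRANCH, REMARK, etc.
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # The atom type is the last whitespace-separated field of the record.
        atom_type = line.split()[-1]
        if atom_type not in valid_types:
            problems.append((line_number, atom_type))
    return problems
```

Because it returns line numbers, the result can be compared directly with the line reported in the docking engine's parse error.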
Error from QuickVina
Parse error on line 15 in file "LIBD-002.pdbqt":
ATOM syntax incorrect: "CG0" is not a valid AutoDock type.
Note that AutoDock atom types are case-sensitive.
Analysis
Why this structure triggers the bug:
- Bridged bicyclic system (not simple fused rings)
- N-methyl bridge creates unusual connectivity pattern
- Complex stereochemistry at ring junctions
- MEEKO's ring perception appears to mishandle this topology
- Pseudo-atoms used for ring closure remain in the final output
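The ring-closure reading of the pseudo-atoms can be checked against the PDBQT excerpt quoted earlier: each "G0" record appears to sit at the coordinates of a "CG0" atom elsewhere in the file. The sketch below pairs them up. The function name `pair_ring_closure_atoms` is hypothetical, and the parsing assumes whitespace-separated fields as shown in the excerpt rather than strict fixed-column PDBQT.

```python
def pair_ring_closure_atoms(atom_lines):
    """Map each G0 pseudo-atom serial to the serial of the CG0 atom at the same coordinates."""
    records = []
    for line in atom_lines:
        fields = line.split()
        serial = int(fields[1])
        # Counting from the end: x, y, z, occupancy, temp factor, charge, type.
        xyz = tuple(fields[-7:-4])  # kept as strings so matching is exact
        atom_type = fields[-1]
        records.append((serial, xyz, atom_type))

    carbon_at = {xyz: serial for serial, xyz, atom_type in records if atom_type == "CG0"}
    return {
        serial: carbon_at.get(xyz)  # None if no CG0 shares these coordinates
        for serial, xyz, atom_type in records
        if atom_type == "G0"
    }


# The four records quoted in this report, without the "Line N:" prefixes.
excerpt = [
    "ATOM 5 C UNL 1 -2.349 0.911 -2.117 1.00 0.00 0.089 CG0",
    "ATOM 6 G UNL 1 -3.095 0.015 -1.122 1.00 0.00 0.000 G0",
    "ATOM 14 C UNL 1 -3.095 0.015 -1.122 1.00 0.00 0.053 CG0",
    "ATOM 15 G UNL 1 -2.349 0.911 -2.117 1.00 0.00 0.000 G0",
]
pairs = pair_ring_closure_atoms(excerpt)
print(pairs)  # {6: 14, 15: 5}
```

Pseudo-atom 6 coincides with atom 14 and pseudo-atom 15 with atom 5, which is the pattern expected if a single bond between atoms 5 and 14 was opened and marked for closure.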
Evidence this is a MEEKO issue:
- 80 other structurally diverse compounds converted successfully
- Only LIBD-002 fails in our dataset of 81 compounds
- The molecule is chemically valid (passes RDKit checks)
- Pseudo-atoms should be internal-only, not exported
Suggested Fix
MEEKO should ensure all pseudo-atoms used for internal topology representation are converted back to standard AutoDock atom types before writing the final PDBQT string. This conversion may require special handling for bridged bicyclic systems.
Workaround
For our pipeline, we're excluding this compound and documenting it as a known MEEKO limitation with bridged bicyclic scaffolds.
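A possible alternative to excluding the compound is to retry the preparation with ring opening disabled. The sketch below has not been tested against LIBD-002. It assumes that the "G0"/"CG0" records come from Meeko's flexible-macrocycle handling and that `MoleculePreparation` accepts a `rigid_macrocycles` keyword in v0.6.1. The helper names (`has_pseudo_atoms`, `prepare_with_fallback`, `meeko_make_pdbqt`) are hypothetical.

```python
PSEUDO_TYPES = ("G0", "CG0")  # the two pseudo-atom types seen in this report


def has_pseudo_atoms(pdbqt_string):
    """True if any ATOM/HETATM record ends in one of the reported pseudo-atom types."""
    for line in pdbqt_string.splitlines():
        if line.startswith(("ATOM", "HETATM")) and line.split()[-1] in PSEUDO_TYPES:
            return True
    return False


def prepare_with_fallback(mol, make_pdbqt):
    """Try the default preparation; retry with rigid rings if pseudo-atoms appear.

    make_pdbqt(mol, rigid_macrocycles) -> str is supplied by the caller, which
    keeps this function independent of any particular Meeko version.
    """
    pdbqt = make_pdbqt(mol, rigid_macrocycles=False)
    if has_pseudo_atoms(pdbqt):
        pdbqt = make_pdbqt(mol, rigid_macrocycles=True)
        if has_pseudo_atoms(pdbqt):
            raise ValueError("pseudo-atom types remain even with rigid_macrocycles=True")
    return pdbqt


def meeko_make_pdbqt(mol, rigid_macrocycles):
    """Adapter around the same Meeko calls used in the reproducible example."""
    # Imported here so the helpers above can be used without Meeko installed.
    from meeko import MoleculePreparation

    # Assumption: rigid_macrocycles is a valid keyword in Meeko v0.6.1.
    preparator = MoleculePreparation(rigid_macrocycles=rigid_macrocycles)
    preparator.prepare(mol)
    return preparator.write_pdbqt_string()


# Usage, with `mol` being the embedded RDKit molecule from the example:
#     pdbqt_string = prepare_with_fallback(mol, meeko_make_pdbqt)
```

The trade-off is that the 7-membered ring stays fixed in its input conformation during docking, so more than one starting conformer may be needed to sample ring flexibility.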
Related Information
- This appears to be an edge case in macrocycle handling
- May affect other bridged systems with similar topology
- Could impact users working with bridged natural products or synthetic macrocycles
Additional context: Happy to provide more details, the full PDBQT file, or test alternative structures if helpful for debugging.