-
Notifications
You must be signed in to change notification settings - Fork 1
Open
Labels
bugSomething isn't workingSomething isn't working
Description
While reviewing the GraphPropertyReader
class, I noticed that the edge_index
in the _read_data
method is built like this:
edge_index = torch.tensor(
[
[bond.GetBeginAtomIdx() for bond in mol.GetBonds()],
[bond.GetEndAtomIdx() for bond in mol.GetBonds()],
]
)
This seems to construct the molecular graph as directed graph, where each bond is added in one direction only [begin_atom_idx] -> [end_atom_idx]
.
❓ Clarification
- Is this directed representation intentional?
- Or is the graph converted to undirected later in the pipeline (perhaps via symmetrization)?
Since molecular graphs typically model bonds as undirected, I wanted to confirm the intended behavior.
💡 Suggestion (if not intentional)
If the goal is an undirected graph, this can be handled cleanly using PyTorch Geometric's to_undirected
utility:
from torch_geometric.utils import to_undirected
edge_index = torch.tensor(
[
[bond.GetBeginAtomIdx() for bond in mol.GetBonds()],
[bond.GetEndAtomIdx() for bond in mol.GetBonds()],
]
)
edge_index = to_undirected(edge_index)
OR
by explicitly creating undirected graphs:
src = [bond.GetBeginAtomIdx() for bond in mol.GetBonds()]
tgt = [bond.GetEndAtomIdx() for bond in mol.GetBonds()]
edge_index = torch.tensor([src + tgt, tgt + src])
Let me know if this was an intentional design or if we should update it. Thanks!
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't working