@antotu commented Aug 19, 2025

Description

This PR introduces a Graph Neural Network (GNN) as an alternative to the Random Forest model for predicting the best device to run a quantum circuit.
To support this, the preprocessing pipeline was redesigned: instead of relying on manually extracted features, the model now takes the Directed Acyclic Graph (DAG) representation of the quantum circuit directly as input.
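As a rough illustration of this preprocessing step, here is a minimal sketch, assuming Qiskit's circuit_to_dag converter and a made-up one-hot gate vocabulary; the actual node encoding used by this PR's create_dag may differ:

    import torch
    from qiskit import QuantumCircuit
    from qiskit.converters import circuit_to_dag

    # Hypothetical gate vocabulary for the one-hot node features.
    GATE_TYPES = ["h", "x", "cx", "rz", "measure"]

    def circuit_to_graph(qc: QuantumCircuit) -> tuple[torch.Tensor, torch.Tensor]:
        """Return (node_features, edge_index) for the circuit's DAG."""
        dag = circuit_to_dag(qc)
        op_nodes = list(dag.op_nodes())
        index = {node: i for i, node in enumerate(op_nodes)}

        # Encode each gate node as a numeric (one-hot) vector.
        x = torch.zeros(len(op_nodes), len(GATE_TYPES))
        for node, i in index.items():
            if node.op.name in GATE_TYPES:
                x[i, GATE_TYPES.index(node.op.name)] = 1.0

        # Keep only edges between operation nodes, i.e. gate dependencies.
        edges = [(index[u], index[v]) for u, v, _ in dag.edges() if u in index and v in index]
        edge_index = torch.tensor(edges, dtype=torch.long).t().contiguous()
        return x, edge_index

The resulting (x, edge_index) pair is the shape of input that torch-geometric's Data objects expect.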


🚀 Major Changes

Graph Neural Network Integration

  • Added a GNN model for predicting the target quantum device and estimating the Hellinger distance between output distributions.
  • Added a preprocessing method to transform quantum circuits into DAGs.
  • DAG representation captures gate dependencies and circuit topology for improved graph-based learning.
  • Integrated automated hyperparameter search with Optuna for tuning GNN performance (a sketch follows this list).
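As an illustration of the Optuna integration mentioned above, a minimal sketch; the search space and the train_and_validate helper below are placeholders, not the actual hyperparameters or code used in this PR:

    import optuna

    def objective(trial: optuna.Trial) -> float:
        # Illustrative search space; the real one may differ.
        hidden_dim = trial.suggest_int("hidden_dim", 32, 256)
        num_layers = trial.suggest_int("num_layers", 2, 6)
        lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)
        dropout = trial.suggest_float("dropout", 0.0, 0.5)
        # train_and_validate is a stand-in for the PR's training routine and
        # should return the validation loss for this configuration.
        return train_and_validate(hidden_dim, num_layers, lr, dropout)

    study = optuna.create_study(direction="minimize")
    study.optimize(objective, n_trials=50)
    print(study.best_params)

Seeding the study's sampler (e.g. optuna.samplers.TPESampler(seed=...)) is what makes such a search reproducible.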

🎯 Motivation

  • Previously, features were manually extracted from the quantum circuit, leading to loss of structural information.
  • This new method preserves the full circuit structure by representing it as a graph.
  • GNNs can exploit graph connectivity to make more accurate predictions.
  • Optuna ensures that GNN hyperparameters are efficiently optimized in a reproducible way.

🔧 Fixes and Enhancements

  • Transformed input quantum circuits into DAGs, where each node is encoded as a numeric vector.
  • Integrated GNNs as an additional predictor in the pipeline.

📦 Dependency Updates

  • optuna>=4.5.0
  • torch-geometric>=2.6.1

Checklist:

  • The pull request only contains commits that are focused and relevant to this change.
  • I have added appropriate tests that cover the new/changed functionality.
  • I have updated the documentation to reflect these changes.
  • I have added entries to the changelog for any noteworthy additions, changes, fixes, or removals.
  • I have added migration instructions to the upgrade guide (if needed).
  • The changes follow the project's style guidelines and introduce no new warnings.
  • The changes are fully tested and pass the CI checks.
  • I have reviewed my own code changes.

@antotu marked this pull request as draft August 19, 2025 16:16

TYPE_CHECKING = False
if TYPE_CHECKING:
    VERSION_TUPLE = tuple[int | str, ...]

Check warning (Code scanning / CodeQL): Unreachable code. This statement is unreachable.
@antotu changed the title from "Gnn branch" to "Add GNN-Based Predictor with DAG Preprocessing" on Aug 21, 2025
@flowerthrower (Collaborator) left a comment:

Hey @antotu, thanks for your continued efforts!
I still haven't managed to get all the way through, so here is another preliminary batch of feedback.

# 2) Global pooling
return global_mean_pool(x, batch)

# 3) MLP head
Collaborator

Suggested change:
- # 3) MLP head

pyproject.toml (Outdated)
Comment on lines 176 to 177

Collaborator

Suggested change
warnings.filterwarnings(
    "ignore",
    message=r"An issue occurred while importing 'torch-scatter'.*",
    category=UserWarning,
Collaborator
pytorch.*:UserWarning should already be ignored through the filterwarnings in pyproject.toml. Please only add the additionally required ones there. The same goes for the other files, too. Thanks!
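For reference, such filters live in pytest's configuration and look roughly like this (the entries below are a sketch in pytest's action:message:category:module format, not the repository's actual configuration):

    [tool.pytest.ini_options]
    filterwarnings = [
        "ignore::UserWarning:pytorch.*",
        "ignore:An issue occurred while importing 'torch-scatter'.*:UserWarning",
    ]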

Comment on lines 144 to 146
number_epochs: The number of epochs to train the GNN model. Defaults to 100.
number_trials: The number of trials to run for hyperparameter optimization for the GNN. Defaults to 50.
verbose: Whether to print verbose output during training GNN. Defaults to False.
Collaborator

Suggested change:
- number_epochs: The number of epochs to train the GNN model. Defaults to 100.
- number_trials: The number of trials to run for hyperparameter optimization for the GNN. Defaults to 50.
- verbose: Whether to print verbose output during training GNN. Defaults to False.
+ **gnn_kwargs: Forwarded to `Predictor.train_gnn_model` when `gnn=True`
+     (e.g., `number_epochs=100`, `number_trials=50`, `verbose=False`).

Comment on lines 130 to 132
    number_epochs: int = 100,
    number_trials: int = 50,
    verbose: bool = False,
Collaborator
Perhaps we can use a gnn_kwargs dictionary here to avoid cluttering the arguments with GNN-specific settings. It could also be useful in the future if more hyperparameters need to be added.
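A minimal sketch of that idea; train_gnn_model is the method named in this diff, while the surrounding train wrapper and its name are hypothetical:

    from typing import Any

    class Predictor:
        def train_gnn_model(self, number_epochs: int = 100, number_trials: int = 50, verbose: bool = False) -> None:
            ...  # existing GNN training logic (elided)

        def train(self, gnn: bool = False, gnn_kwargs: dict[str, Any] | None = None) -> None:
            """Train a model; GNN-only options are forwarded via gnn_kwargs."""
            if gnn:
                # Keeps the public signature small and makes future
                # hyperparameters cheap to add.
                self.train_gnn_model(**(gnn_kwargs or {}))

    # Usage: predictor.train(gnn=True, gnn_kwargs={"number_epochs": 200, "verbose": True})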

antotu and others added 2 commits August 27, 2025 13:14
@flowerthrower (Collaborator) left a comment:

This is just another batch of feedback. Thank you for integrating the requested changes so quickly!



def create_dag(qc: QuantumCircuit) -> tuple[torch.Tensor, torch.Tensor, int]:
    """Creates and returns the associate DAG of the quantum circuit.

Collaborator

Suggested change:
- """Creates and returns the associate DAG of the quantum circuit.
+ """Creates and returns the feature-annotated DAG of the quantum circuit.


from __future__ import annotations

import math
Collaborator

NumPy is already imported and provides the same functionality. If I remember correctly, PyTorch also provides these basics, so perhaps we can reduce the imports here further.
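To illustrate (generic examples, not lines from this diff):

    import math
    import numpy as np
    import torch

    math.log2(8.0)                  # 3.0, requires the extra import
    np.log2(8.0)                    # same result via the existing NumPy import
    torch.log2(torch.tensor(8.0))   # tensor(3.), if the values are tensors anyway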

    return_arrays: bool = False,
    verbose: bool = False,
) -> tuple[float, dict[str, float], tuple[np.ndarray, np.ndarray] | None]:
    """Evaluate the models.
Collaborator
Can we make the description a bit more detailed? Just so we know why this is necessary and that it is only required for the GNN models.

    restore_best: bool = True,
    scheduler: torch.optim.lr_scheduler._LRScheduler | None = None,
) -> None:
    """Trains the model with optional early stopping on validation loss.
Collaborator

Suggested change:
- """Trains the model with optional early stopping on validation loss.
+ """Trains a GNN model with optional early stopping on validation loss.

qc = QuantumCircuit.from_qasm_file(path_uncompiled_circuit / file)
feature_vec = create_feature_vector(qc)
training_sample = (feature_vec, target_label)
if not self.gnn:
Collaborator

Suggested change:
- if not self.gnn:
+ if self.gnn:
+     x, edge_index, number_of_gates = create_dag(qc)
+     y = torch.tensor([[dev.description for dev in self.devices].index(target_label)], dtype=torch.float)
+     training_sample = (x, y, edge_index, number_of_gates, target_label)
+ else:
+     feature_vec = create_feature_vector(qc)
+     training_sample = (feature_vec, target_label)
+ circuit_name = str(file).split(".")[0]
+ return training_sample, circuit_name, scores_list

Comment on lines 442 to 446
feature_vec = create_feature_vector(qc)
training_sample = (feature_vec, target_label)
circuit_name = str(file).split(".")[0]
return training_sample, circuit_name, scores_list
x, edge_index, number_of_gates = create_dag(qc)
Collaborator

Suggested change (delete these lines):
- feature_vec = create_feature_vector(qc)
- training_sample = (feature_vec, target_label)
- circuit_name = str(file).split(".")[0]
- return training_sample, circuit_name, scores_list
- x, edge_index, number_of_gates = create_dag(qc)

Comment on lines 448 to 459
self.devices_description = [dev.description for dev in self.devices]
y = self.devices_description.index(target_label)
print(target_label)
return Data(
    x=x,
    y=torch.tensor([y], dtype=torch.float),
    circuit_name=circuit_name,
    edge_index=edge_index,
    target_label=target_label,  # torch.tensor([target_label], dtype=torch.float),
    scores_list=scores_list,
    num_nodes=number_of_gates,
)
Collaborator

Suggested change (delete these lines):
- self.devices_description = [dev.description for dev in self.devices]
- y = self.devices_description.index(target_label)
- print(target_label)
- return Data(
-     x=x,
-     y=torch.tensor([y], dtype=torch.float),
-     circuit_name=circuit_name,
-     edge_index=edge_index,
-     target_label=target_label,  # torch.tensor([target_label], dtype=torch.float),
-     scores_list=scores_list,
-     num_nodes=number_of_gates,
- )

return mdl.best_estimator_

def _get_prepared_training_graphs(self) -> TrainingData:
Collaborator

With the changes above, we can drop this graph-specific method and instead use the _get_prepared_training_data method with a slight modification when loading the graph-specific training data (if self.gnn: ...).
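A rough sketch of that merged loader, assuming the names visible in this diff (create_dag, create_feature_vector, TrainingData); the iteration and constructor details are hypothetical:

    def _get_prepared_training_data(self) -> TrainingData:
        """One loader for both model types; branch only where samples are built."""
        samples = []
        for file in self.training_files:  # hypothetical attribute listing circuit files
            qc = QuantumCircuit.from_qasm_file(self.path_uncompiled_circuit / file)
            if self.gnn:
                samples.append(create_dag(qc))  # (x, edge_index, number_of_gates)
            else:
                samples.append(create_feature_vector(qc))
        return TrainingData(samples)  # hypothetical constructor call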

train_loss = running_loss / max(1, total)
if scheduler is not None:
    scheduler.step()
val_loss = float("inf")

Check warning (Code scanning / CodeQL): Variable defined multiple times. This assignment to 'val_loss' is unnecessary, as it is redefined before this value is used.
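The straightforward fix is to assign val_loss only where it is computed, e.g. (a sketch; compute_validation_loss stands in for the code's real validation step):

    if scheduler is not None:
        scheduler.step()
    # No dead float("inf") initialization: assign once, where the value is produced.
    val_loss = compute_validation_loss()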