Skip to content

Add SparseObservable.to_matrix, migrate from SparsePauliOp#15022

Open
aaryav-3 wants to merge 62 commits intoQiskit:mainfrom
aaryav-3:sparse_observable_tomatrix
Open

Add SparseObservable.to_matrix, migrate from SparsePauliOp#15022
aaryav-3 wants to merge 62 commits intoQiskit:mainfrom
aaryav-3:sparse_observable_tomatrix

Conversation

@aaryav-3
Copy link
Copy Markdown
Contributor

@aaryav-3 aaryav-3 commented Sep 17, 2025

Summary

This PR fixes GitHub issue #13389 by introducing the to_matrix method in SparseObservable, by omitting the usage of PauliList and using Extended Alphabet based Pauli projectors instead for time and space efficiency


Checklist

  • I have added the tests to cover my changes.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.

Summary of Changes

Migrate and Generalize to_matrix from SparsePauliOp to SparseObservable

  • Migrated to_matrix to SparseObservable: The Rust implementation for matrix conversion now resides in SparseObservable, making it a core feature of that class. Functionality is now attained via an intermediate representation called :class:MatrixComputationData, logically equivalent to :class:ZXPaulis but fine tuned for BitTerms.
  • Added Projector Support: Now correctly handles single-character shorthands (from Extended Alphabet) for Pauli projectors (e.g., '0' for |0⟩⟨0|) by expanding them into their equivalent Pauli sums with the correct complex phases, uses :meth:as_paulis for non-paulis currently, however pure paulis are handled via direct bit manipulations
  • Added Rust Unit Tests: A new set of unit tests verifies the consistency between the serial and parallel execution paths of the new methods.

Benchmark Performance Report:

Sparse Matrix - Parallel (sparse=True, force_serial=False):

  • 10q/100t: 0.0146s vs 0.0142s → 0.97x (neutral)
  • 15q/250t: 0.3073s vs 0.2777s → 1.11x slower ⚠️
  • 20q/500t: 3.7603s vs 6.6647s → 1.77x faster

Sparse Matrix - Serial (sparse=True, force_serial=True):

  • 10q/100t: 0.0047s vs 0.0259s → 5.49x faster
  • 15q/250t: 0.3480s vs 1.4627s → 4.20x faster
  • 20q/500t: 22.3129s vs 67.0121s → 3.00x faster

Dense Matrix Performance:

  • 5q/50t (32×32 matrix): 0.0004s vs 0.0005s → 1.12x faster
  • 8q/100t (256×256 matrix): 0.0005s vs 0.0005s → 1.06x faster
  • 10q/150t (1024×1024 matrix): 0.0021s vs 0.0022s → 1.07x faster
  • 12q/200t (4096×4096 matrix): 0.0176s vs 0.0159s → 1.11x slower ⚠️
  • 14q/250t (16384×16384 matrix): 0.1159s vs 0.1227s → 1.06x faster

Feature branch shows not only comparable results, but a large speedup in all cases. It is especially observed in sparse matrix conversion in the serial case, which is the expected performance boost from BitTerm representation. Further, the parallel case also reaps benefit from the speedup, however, the added conversion and rayon chunking overhead adds a greater difference to the smaller runtimes.

@qiskit-bot qiskit-bot added the Community PR PRs from contributors that are not 'members' of the Qiskit repo label Sep 17, 2025
@aaryav-3 aaryav-3 changed the title Sparse observable tomatrix Add SparseObservable.to_matix, migrate from SparsePauliOp Sep 17, 2025
@aaryav-3 aaryav-3 marked this pull request as ready for review September 17, 2025 13:52
@aaryav-3 aaryav-3 requested a review from a team as a code owner September 17, 2025 13:52
@qiskit-bot
Copy link
Copy Markdown
Collaborator

Thank you for opening a new pull request.

Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient.

While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone.

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core

@eliarbel eliarbel added mod: quantum info Related to the Quantum Info module (States & Operators) Rust This PR or issue is related to Rust code in the repository and removed Community PR PRs from contributors that are not 'members' of the Qiskit repo labels Sep 18, 2025
@jakelishman jakelishman added this to the 2.3.0 milestone Oct 8, 2025
@github-project-automation github-project-automation bot moved this to Ready in Qiskit 2.3 Oct 8, 2025
@ShellyGarion ShellyGarion linked an issue Oct 28, 2025 that may be closed by this pull request
Copy link
Copy Markdown
Member

@ShellyGarion ShellyGarion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aaryav-3 - thanks for your contribution to qiskit.
Any idea why the CI checks are not running? maybe there is some conflict with the main branch?
In addition, is this code migrated from other parts of qiskit? I didn't see any deletions.
Since the aim of this PR is to optimize the current code, could you pehaps add some performence benchmarks?

y_count += 1;
}
BitTerm::Z => z_bits |= 1u32 << qubit,
_ => panic!("Expected only Pauli terms in precomputation"),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why to panic here and not raise an error?

Copy link
Copy Markdown
Contributor Author

@aaryav-3 aaryav-3 Oct 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ShellyGarion thank you so much for your comments. I have removed the panic and propagated an error in the latest push. As for code deletion and benchmarking, I had mocked up some local testing to prove it while I was writing code. How would you suggest I do the same in our codebase; Should I write a new test case for it for runtime check? or is there somewhere I could report my results? Additionally, in the former case, if I removed deprecated code, I would like to get an idea of how I would still retain a comparative performance benchmark. It'll be great help if you could let me know the process for the same.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thnaks @aaryav-3.
You can report your performence benchmark results as a comment in this PR.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems that CI checks Miri & rust tests are failing - could you please handle it?

Copy link
Copy Markdown
Contributor Author

@aaryav-3 aaryav-3 Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Shelly, thank you for running the tests and bringing it to my notice. I am now aware of the failing rust tests, and it's been fixed in my recent push. As for the Miri tests, I'm yet to get a better idea for that. Besides this,
in benchmarking, while dense matrix conversions show comparable runtimes, for operators >=20 qubits the .to_matrix(sparse=True) method in my feature branch is ~2× slower than main due to some operational overhead, making the advantage from BitTerm mapping seem invisible. The slowdown stems from my naive port of the impl_to_matrix_sparse! macro from sparse_pauli_op.rs.
In SparseObservable, each BitTerm can expand into a superposition of multiple Paulis, so the deterministic one-to-one row–column mapping used in SparsePauliOp no longer holds. Further, functions like sparse_observable_to_matrix_serial_64 repeatedly recompute and sort row indices (see order.sort_unstable_by, this previously showed almost linear time behavior in the case of SparsePauliOp.to_matrix(sparse=True) due to deterministic XOR permutations) for every row, proving another bottleneck. I’m optimizing this by introducing Gray-code indexing, caching precomputed parity and index flips for related terms, and replacing per-row comparison sorts with in-place radix sorting of CSR rows (to revert back to linear time). These changes should restore near-main-branch performance, I'll push the changes soon, and I'd love to know what you think of them. Thank you very much for your patience!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems that rust tests in Miri are still failing, could you look into it?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, working on it

@coveralls
Copy link
Copy Markdown

coveralls commented Oct 28, 2025

Pull Request Test Coverage Report for Build 23049574652

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 570 of 728 (78.3%) changed or added relevant lines in 3 files are covered.
  • 169 unchanged lines in 4 files lost coverage.
  • Overall coverage decreased (-0.2%) to 87.326%

Changes Missing Coverage Covered Lines Changed/Added Lines %
crates/quantum_info/src/sparse_pauli_op.rs 64 65 98.46%
crates/quantum_info/src/sparse_observable/mod.rs 503 660 76.21%
Files with Coverage Reduction New Missed Lines %
crates/circuit/src/parameter/parameter_expression.rs 1 86.99%
crates/qasm2/src/lex.rs 3 92.03%
crates/quantum_info/src/rayon_ext.rs 60 0.0%
crates/quantum_info/src/sparse_pauli_op.rs 105 79.63%
Totals Coverage Status
Change from base Build 23023092226: -0.2%
Covered Lines: 101422
Relevant Lines: 116142

💛 - Coveralls

Copy link
Copy Markdown
Member

@ShellyGarion ShellyGarion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @Cryoris - do you have any further comments?

Copy link
Copy Markdown
Member

@ShellyGarion ShellyGarion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thnaks for this nice addition. Perhaps it's also worth to add a test that shows that it's done efficiently, based for example on this comment


// Pauli Y
static L_Y_OUT0: [LocalEntry; 1] = [LocalEntry(1, Complex64::new(0.0, -1.0))]; // -i
static L_Y_OUT1: [LocalEntry; 1] = [LocalEntry(0, Complex64::new(0.0, 1.0))]; // +i
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of static is very good, but there are some duplications here:
L_X_OUT0 is the same as L_ONE_OUT1,
L_X_OUT1 is the same as L_Z_OUT0 and L_ZERO_OUT0,
etc.

Is there a simpler way to define them only once?
See e.g.
https://github.com/Qiskit/qiskit/blob/main/crates/circuit/src/util.rs
https://github.com/Qiskit/qiskit/blob/main/crates/circuit/src/gate_matrix.rs

@Cryoris
Copy link
Copy Markdown
Collaborator

Cryoris commented Dec 9, 2025

Moving to 2.4 since we need more time to review the new projector handling

@Cryoris Cryoris modified the milestones: 2.3.0, 2.4.0 Dec 9, 2025
@github-project-automation github-project-automation bot moved this to Ready in Qiskit 2.4 Dec 9, 2025
@Cryoris Cryoris removed this from Qiskit 2.3 Dec 9, 2025
floating-point associativity effects."""
# Using powers-of-two coefficients to make floating-point arithmetic associative so we can
# do bit-for-bit assertions. Choose labels that have at least few overlapping locations.
labels = ["XZIXYX", "YIIYXY", "ZZZIIZ", "IIIIII"]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In test_to_matrix_parallel_vs_serial, the test checks that the serial and parallel implementations of to_matrix produce exactly the same matrix (no element differs at all).

Is this exact match something you intend to guarantee for all observables, or is it mainly a sanity check that both paths are doing the same kind of work?

Copy link
Copy Markdown
Contributor Author

@aaryav-3 aaryav-3 Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, this is an exact match that I intend to guarantee for all observables. The resulting matrix with or without parallelization should be the same. Thank you for your review!

Copy link
Copy Markdown
Member

@alexanderivrii alexanderivrii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @aaryav-3, for working on this. I realize that this was a fairly complex piece of development, and I appreciate the effort that you have put into this.

It is also a bit challenging to review this. If I understand correctly, there is no paper that describes this extension for matrix construction from SparsePauliOp to SparseObservable. Is there somewhere a description/documentation of this extension?

I would definitely appreciate some general description of the idea, and better documentation of the new structures and functions that you introduced. For example, what are L_PLUS_OUT0 (and others), what are different fields in TermDecomp, etc.?

If I recall correctly, the code for SparsePauliOp modified the matrix in-place, is this also done here?

let local = local_for_bitterm(bt).unwrap(); // safe: all BitTerms have Local2x2
qubits.push(q);
locals.push(local);
mask |= 1u64 << q;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this panic when q > 64?

Comment on lines +550 to +558
// Compact compiled form of a SparseObservable term
#[derive(Clone, Debug)]
struct TermDecomp {
qubits: Vec<u32>,
mask: u64,
dest_offsets: Vec<u64>,
locals: Vec<&'static Local2x2>,
coeff: Complex64,
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't immediately figure out what's going on: is there a reference to a paper/text that describes this extension to sparse observables?

Could you please add docstring comments explaining the fields in the structure (e.g. what are mask, dest_offset, etc.)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the comments in my latest commit. Thank you!

/// Done for runtime optimization before matrix construction.
/// This deduplicates equivalent Pauli terms within the SparseObservable,
/// ensuring each unique (X, Z) mask pair appears only once.
fn combine_duplicates(&mut self) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use SparseObservable.canonicalize instead?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right, thank you so much for pointing out to this API, it indeed fits in place with the current implementation, and a new function is not needed. I have addressed this in my latest commit.

Comment on lines +3296 to +3298
///
/// :meth:to_matrix Convert the observable to a dense NumPy array or a sparse
/// CSR matrix.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be to_matrix_sparse and to_matrix_dense rather than to_matrix?

Comment on lines +1016 to +1018
// ---------------------------------------------------------------------------
// PauliSparseObservable; CSR matrix conversion (Pure Pauli case)
// ---------------------------------------------------------------------------
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to consider the pure Pauli case separately - is this a performance optimization or something else?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, it is a performance optimization retained from legacy implementation

Comment on lines +406 to +408
// ---------------------------------------------------------------------------
// Local 2x2 operator tables for BitTerms (Paulis + projectors)
// ---------------------------------------------------------------------------
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you describe the intuition behind these local tables?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added comments documenting their role in the latest commit. I do believe, with a major refactor like this, we need to document it exhaustively. Thank you for bringing this up!

@aaryav-3 aaryav-3 mentioned this pull request Mar 12, 2026
3 tasks
@alexanderivrii alexanderivrii modified the milestones: 2.4.0, 2.5.0 Mar 18, 2026
@gadial gadial self-assigned this Mar 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Changelog: Added Add an "Added" entry in the GitHub Release changelog. mod: quantum info Related to the Quantum Info module (States & Operators) Rust This PR or issue is related to Rust code in the repository

Projects

Status: Ready

Development

Successfully merging this pull request may close these issues.

Implement SparseObservable.to_matrix

10 participants