@wdconinc
Contributor

Briefly, what does this PR introduce?

This PR adds a data type that records the "truthiness" (as mathematically defined below) of a reconstructed event, where truthiness is the "quality of seeming or being felt to be true, even if not necessarily true," or in this case also "the amount of confidently proclaiming the wrong thing to be true."

Mathematically, truthiness is a non-negative value that is zero only for perfectly reconstructed events (positive-definite), and is radially increasing in the error of the reconstruction (greater error leads to greater truthiness).

It is possible to define truthiness in multiple ways, but we will typically use some combination of the following components:

  • a χ² measure on associated reconstructed and generated particles, with normalization given by the determined uncertainty in the reconstruction (if available) or 1 GeV otherwise,
  • a positive penalty term for discrete reconstruction errors, such as PID mis-identification (where weighting can be used to penalize some mis-identification more than others),
  • a positive penalty term for generated particles that should have been reconstructed, but weren't,
  • a positive penalty term for reconstructed particles that were not part of the original event record.
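
As a rough illustration of how these components might combine, here is a minimal Python sketch. It is not the implementation in this PR: the names `Match` and `truthiness`, the restriction of the χ² term to the three momentum components, and the unit penalty weights are all assumptions made for the example. Only the 1 GeV fallback normalization comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Match:
    """Hypothetical generated/reconstructed particle pair (momenta in GeV)."""
    p_gen: Vec3
    p_rec: Vec3
    sigma: Optional[float] = None  # reconstruction uncertainty in GeV, if available
    pdg_gen: int = 0
    pdg_rec: int = 0


def truthiness(
    matches: Sequence[Match],
    n_unassociated_mc: int,
    n_unassociated_rc: int,
    pid_penalty: float = 1.0,      # placeholder weight (assumption)
    missing_penalty: float = 1.0,  # placeholder weight (assumption)
    fake_penalty: float = 1.0,     # placeholder weight (assumption)
    default_sigma: float = 1.0,    # 1 GeV fallback normalization
) -> float:
    """Sum a chi-square-like term and the three discrete penalty terms."""
    total = 0.0
    for m in matches:
        # Fall back to 1 GeV when the uncertainty is missing (or zero).
        sigma = m.sigma if m.sigma else default_sigma
        total += sum((r - g) ** 2 for r, g in zip(m.p_rec, m.p_gen)) / sigma**2
        # Discrete penalty for PID mis-identification.
        if m.pdg_rec != m.pdg_gen:
            total += pid_penalty
    # Generated particles that were not reconstructed.
    total += missing_penalty * n_unassociated_mc
    # Reconstructed particles with no generated counterpart.
    total += fake_penalty * n_unassociated_rc
    return total
```

With non-negative weights every term is non-negative, so the sum is zero only when all matched momenta agree exactly, all PIDs agree, and nothing is left unassociated.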

There are non-reconstruction reasons why the truthiness will be non-zero in realistic scenarios:

  • multiple-scattering effects will cause particles in the event to lose momentum relative to their true values, so the reconstructed momenta deviate in both direction and magnitude, with a consistent (systematic) bias,
  • secondary particles will be generated in materials or along bent trajectories, leading to additional reconstructed particles corresponding to e.g. hard bremsstrahlung gammas in the electromagnetic calorimeters,
  • primary particles (in particular at low energies) may be absorbed in support structures, leading to their absence in the reconstructed event.

Nevertheless, a decrease in the overall average event truthiness for the same geometry and input hit collections is intended to indicate an improved reconstruction, and conversely an increase indicates a degraded one.
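
A comparison of this kind could be sketched as follows. The function name `mean_truthiness_change` is hypothetical, and the sketch assumes the per-event truthiness values have already been extracted into plain lists for two reconstruction versions run on the same geometry and input hit collections.

```python
from statistics import mean
from typing import Sequence


def mean_truthiness_change(before: Sequence[float], after: Sequence[float]) -> float:
    """Return mean(after) - mean(before); a negative value suggests improvement."""
    if not before or not after:
        raise ValueError("both samples must contain at least one event")
    return mean(after) - mean(before)
```

For example, `mean_truthiness_change([4.0, 2.0], [1.0, 3.0])` returns `-1.0`, which would point toward an improved reconstruction on that (tiny) sample.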

What kind of change does this PR introduce?

  • Bug fix (issue #__)
  • New feature (issue: store truthiness for event reconstruction)
  • Documentation update
  • Other: __

Please check if this PR fulfills the following:

  • Tests for the changes have been added
  • Documentation has been added / updated
  • Changes have been communicated to collaborators

Does this PR introduce breaking changes? What changes might users need to make to their code?

No.

Does this PR change default behavior?

No.

@wdconinc wdconinc requested a review from a team as a code owner October 29, 2025 17:12
- edm4eic::MCRecoParticleAssociation associations // Reference to the associated reconstructed particles
- edm4hep::MCParticle unassociated_mc_particles // Reference to the unassociated MC particles
- edm4eic::ReconstructedParticle unassociated_rc_particles // Reference to the unassociated reconstructed particles

Contributor Author

This definition of the truthiness data type excludes vertex terms. I don't think we are ready at this point to compare generated and reconstructed vertices, and they are not easily accessible through individual associations. It may be possible to use an ad hoc relation in which an MCParticle stands for the vertex where that particle was generated. Still, I think that is a harder problem than this first attempt. One thing to keep in mind for the vertexing problem is that it is hard to define what counts as a missing reconstructed vertex, since some vertices will be so close together as to be effectively unresolvable.

Contributor

@ruse-traveler ruse-traveler left a comment

Interesting! I'm really intrigued by this! I'm curious about how this type would get used in practice, so for my own understanding: the vector members should be 1-to-1 with the relations in the associations field, correct?

@wdconinc
Contributor Author

wdconinc commented Nov 4, 2025

> Interesting! I'm really intrigued by this! I'm curious about how this type would get used in practice, so for my own understanding: the vector members should be 1-to-1 with the relations in the associations field, correct?

It's not intended for analyzers but for reconstruction development (so absolutely not for selecting for your analysis only those events that are close to the truth). But I would imagine we can use this to select events that are particularly poorly reconstructed and look at what went wrong, or compare before and after PRs to make sure we don't make things worse.
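
As a sketch of the first use case (finding the worst-reconstructed events to inspect), assuming the truthiness values are available as a mapping from a hypothetical event index to a float:

```python
import heapq
from typing import Dict, List, Tuple


def worst_events(truthiness_by_event: Dict[int, float], n: int = 10) -> List[Tuple[int, float]]:
    """Return the n (event index, truthiness) pairs with the largest truthiness."""
    return heapq.nlargest(n, truthiness_by_event.items(), key=lambda kv: kv[1])
```

Here `worst_events` and the dictionary layout are illustrative only; they are not part of the data type added in this PR.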
