-
Notifications
You must be signed in to change notification settings - Fork 2
Adding GenJets to LST input #214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
slava77
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as I mentioned during the meeting, it would help to fill the closest jet dr per sim_ (and per trk_ ) in this PR rather than to rely on the post-processing script
| #include "DataFormats/HepMCCandidate/interface/GenParticle.h" | ||
| #include "DataFormats/JetReco/interface/GenJet.h" | ||
| #include "DataFormats/JetReco/interface/GenJetCollection.h" | ||
| #include "DataFormats/HepMCCandidate/interface/GenParticleFwd.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alpha-order
| std::vector<uint16_t> ph2_clustSize; | ||
| std::vector<size_t> ph2_clustSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this reverts a recent change from @alexandertuna ; please undo
| std::vector<float> genJetPt; | ||
| std::vector<float> genJetEta; | ||
| std::vector<float> genJetPhi; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keep the same naming style as other branches genjet_pt,eta,phi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking at https://github.com/cms-sw/cmssw/blob/master/DataFormats/JetReco/interface/GenJet.h#L90C11-L99, I think it's worth to add invisibleEnergy, auxiliaryEnergy
| @@ -1473,7 +1489,8 @@ TrackingNtuple::TrackingNtuple(const edm::ParameterSet& iConfig) | |||
| keepEleSimHits_(iConfig.getUntrackedParameter<bool>("keepEleSimHits")), | |||
| saveSimHitsP3_(iConfig.getUntrackedParameter<bool>("saveSimHitsP3")), | |||
| simHitBySignificance_(iConfig.getUntrackedParameter<bool>("simHitBySignificance")), | |||
| parametersDefiner_(iConfig.getUntrackedParameter<edm::InputTag>("beamSpot"), consumesCollector()) { | |||
| parametersDefiner_(iConfig.getUntrackedParameter<edm::InputTag>("beamSpot"), consumesCollector()), | |||
| tok_jets_(consumes<reco::GenJetCollection>(iConfig.getParameter<edm::InputTag>("JetSource"))) { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| tok_jets_(consumes<reco::GenJetCollection>(iConfig.getParameter<edm::InputTag>("JetSource"))) { | |
| tok_jets_(consumes<reco::GenJetCollection>(iConfig.getParameter<edm::InputTag>("jetSource"))) { |
naming style from https://cms-sw.github.io/cms_coding_rules.html
| edm::Handle<reco::GenJetCollection> genJets; | ||
| iEvent.getByToken(tok_jets_, genJets); | ||
| if (genJets.isValid()) { | ||
| for (unsigned iGenJet = 0; iGenJet < genJets->size(); ++iGenJet) { | ||
| const reco::GenJet& genJet = (*genJets)[iGenJet]; | ||
| genJetPt.push_back(genJet.pt()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| edm::Handle<reco::GenJetCollection> genJets; | |
| iEvent.getByToken(tok_jets_, genJets); | |
| if (genJets.isValid()) { | |
| for (unsigned iGenJet = 0; iGenJet < genJets->size(); ++iGenJet) { | |
| const reco::GenJet& genJet = (*genJets)[iGenJet]; | |
| genJetPt.push_back(genJet.pt()); | |
| auto const& genJets = iEvent.get(tok_jets_); | |
| for (auto const& jet : genJets) { | |
| genjet_pt.push_back(jet.pt()); |
more compact access.
Also, I prefer to not silently skip: jets should be available in every event; if missing there's something wrong with the configuration and this should fail.
| @@ -1043,103 +1068,101 @@ void fillEfficiencySet(int isimtrk, | |||
| const float vtx_z_thresh = 30; | |||
| const float vtx_perp_thresh = 2.5; | |||
|
|
|||
| if (pt > 0 && jet_eta < 140 && jet_eta > -140 && (jet_eta > -999 && deltaEta > -999)) { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(not sure if I searched well enough) shouldn't we have some jet pt minimum requirement to make the jet-related plots?
I expect it to be configurable with a default (order of or e.g.) 100 GeV
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR replaces the previous jet clustering method (from PR #187) with GenJets added directly to the trackingNtuple.root file during its creation. The changes enable comparison of reconstructed tracks to jets for examining duplicate and fake rates.
Key Changes:
- GenJet branches (
genjet_pt,genjet_eta,genjet_phi, etc.) are now added to trackingNtuple.root during creation - ΔR calculations between sim/reco tracks and their closest GenJet are computed and stored
- Plotting infrastructure updated to support efficiency, duplicate rate, and fake rate vs. ΔR plots
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 17 comments.
Show a summary per file
| File | Description |
|---|---|
| Validation/RecoTrack/plugins/TrackingNtuple.cc | Adds GenJet collection reading, new branches for GenJet properties and sim track ΔR values, and implements fillGenJets() method to compute ΔR between sim tracks and closest jets |
| RecoTracker/LSTCore/standalone/efficiency/src/performance.h | Updates function parameter names from jet_eta/jet_phi/jet_pt to genJetEta/genJetPhi/genJetPt for consistency |
| RecoTracker/LSTCore/standalone/efficiency/src/performance.cc | Renames jet-related branches to genJet naming, adds ΔR computation for reconstructed tracks, creates deltaR histograms for duplicate/fake rate analysis, and removes previous jet-related conditional logic |
| RecoTracker/LSTCore/standalone/efficiency/python/lst_plot_performance.py | Updates variable choices and plotting labels from jet_* to genJet* naming convention, adds deltaR to additional metric plots |
| RecoTracker/LSTCore/standalone/code/core/write_lst_ntuple.cc | Updates branch names from sim_jet_* to genJet* for consistency with new naming convention |
| RecoTracker/LSTCore/standalone/analysis/jets/reformat_jets2.py | New script that adds sim_deltaEta, sim_deltaPhi, and sim_deltaR branches to trackingNtuple.root by computing ΔR between sim tracks and GenJets |
| RecoTracker/LSTCore/standalone/analysis/jets/reformat_jets.py | Updates hardcoded file paths (appears to be for testing purposes, may be superseded by reformat_jets2.py) |
Comments suppressed due to low confidence (3)
RecoTracker/LSTCore/standalone/efficiency/src/performance.cc:1568
- The loop variable
ijetis declared but not initialized. This will cause undefined behavior as the loop condition will evaluate an uninitialized variable. The initialization should beijet = 0.
if (std::abs(eta) < ana.eta_cut and pt > ana.pt_cut) {
RecoTracker/LSTCore/standalone/efficiency/src/performance.cc:1503
- The GenJet vectors are being retrieved from the input file inside this function that is called for every reconstructed track. This is inefficient as it repeatedly reads the same data. Consider moving this retrieval outside the loop or to a higher scope where it can be done once per event.
}
if (std::abs(eta) < ana.eta_cut and pt > ana.pt_cut) {
ana.tx.pushbackToBranch<float>(category_name + "_dr_denom_phi", phi);
RecoTracker/LSTCore/standalone/efficiency/src/performance.cc:1563
- The GenJet vectors are being retrieved from the input file inside this function that is called for every reconstructed track. This is inefficient as it repeatedly reads the same data. Consider moving this retrieval outside the loop or to a higher scope where it can be done once per event.
ana.tx.pushbackToBranch<float>(category_name + "_fdr_numer_eta", eta);
}
if (std::abs(eta) < ana.eta_cut) {
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| std::vector<float> genJetEta = lstEff.getVF("genJetEta"); | ||
| std::vector<float> genJetPhi = lstEff.getVF("genJetPhi"); | ||
|
|
||
| float dEtaj = 999; | ||
| float dPhij = 999; | ||
| float dRj = 999; | ||
| float dRTemp = 999; | ||
| float genJetPtj; | ||
|
|
||
| for (unsigned int ijet; ijet < genJetPhi.size(); ++ijet) { | ||
| genJetPtj = genJetPt[ijet]; | ||
| if(genJetPtj > 100) { |
Copilot
AI
Jan 6, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ΔR calculation logic is duplicated across three functions (fillFakeRateSet, fillDuplicateRateSet, fillFakeOrDuplicateRateSet). This duplicated code makes maintenance harder and increases the risk of inconsistencies. Consider extracting this logic into a helper function that computes the minimum ΔR to the closest GenJet.
| } | ||
| if (ana.jet_branches) { | ||
| std::vector<float> genJetPt = lstEff.getVF("genJetPt"); |
Copilot
AI
Jan 6, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The GenJet vectors are being retrieved from the input file inside this function that is called for every reconstructed track. This is inefficient as it repeatedly reads the same data. Consider moving this retrieval outside the loop or to a higher scope where it can be done once per event.
| float dRTemp = 999; | ||
| for (auto const& jet : genJets) { | ||
| dEtaj = sim_eta - jet.eta(); | ||
| dPhij = std::acos(std::cos(sim_phi - jet.phi())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| dPhij = std::acos(std::cos(sim_phi - jet.phi())); | |
| dPhij = reco::deltaPhi(sim_phi, jet.phi()); |
| dRj = std::sqrt(std::pow(dEtaj,2) + std::pow(dPhij,2)); | ||
| if(dRj < dRTemp) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| dRj = std::sqrt(std::pow(dEtaj,2) + std::pow(dPhij,2)); | |
| if(dRj < dRTemp) { | |
| dRj2 = std::pow(dEtaj,2) + std::pow(dPhij,2); | |
| if(dRj2 < dRTemp) { |
then below call sqrt only when filling the sim_deltaR
| t->Branch("sim_deltaEta", &sim_deltaEta); | ||
| t->Branch("sim_deltaPhi", &sim_deltaPhi); | ||
| t->Branch("sim_deltaR", &sim_deltaR); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this to where other sim_ branches are for clarity
| std::vector<float> genjet_phi; | ||
| std::vector<float> genjet_invisible_energy; | ||
| std::vector<float> genjet_auxiliary_energy; | ||
| std::vector<float> sim_deltaEta; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| std::vector<float> sim_deltaEta; | |
| std::vector<float> sim_genJetDEta; |
or
| std::vector<float> sim_deltaEta; | |
| std::vector<float> sim_genjet_deltaEta; |
for clarity
| sim_deltaEta.push_back(dEtaTemp); | ||
| sim_deltaPhi.push_back(dPhiTemp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of these it would be more useful to save the index of the genjet
| jet_eta.at(isimtrk), | ||
| jet_phi.at(isimtrk), | ||
| jet_pt.at(isimtrk)); | ||
| if (genJetPt.at(isimtrk) > 100) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised this is not crashing: IIUC genJetPt is per jet while isimtrk is per sim track and there are many more sim tracks
In PR #187, I added branches to trackingNtuple.root input file after clustering sim tracks into jets using FastJet. However, this method prevents us from comparing reconstructed tracks to the jets, which I need to do in order to examine how the jets affect the duplicate and fake rates. Here I have replaced the former method of jet clustering with GenJets that are added to trackingNtuple.root at the step where it is produced.
These are the pertinent changes I've made:
Validation/RecoTrack/plugins/TrackingNtuple.cc.When the trackingNtuple.root is created with
cmsDriver.py, it now will include the GenJet branches forgenJetPt,genJetEta, andgenJetPhi. I did not include a way to optionally enable/disable this.RecoTracker/LSTCore/standalone/analysis/jets/reformat_jets2.pyto compute ΔR for the sim tracks and add it to trackingNtuple.rootThis completely replaces both reformat_jets.py and myjets.py from PR 187.
RecoTracker/LSTCore/standalone/efficiency/src/performance.ccThis is still enabled/disabled when running LST by the use of the
-jtag.Modified RecoTracker/LSTCore/standalone/code/core/write_lst_ntuple.ccandRecoTracker/LSTCore/standalone/efficiency/python/lst_plot_performance.pyin small ways to reflect these changes.The input file differs from the standard one due to the addition of GenJet branches when it is created and due to the
sim_deltaEta,sim_deltaPhi, andsim_deltaRbranches that are tacked on after. The only change to the output is the inclusion of efficiency, DR, and FR vs. ΔR plots, where ΔR is the distance from a track to its closest GenJet. Otherwise, this PR should not impact the performance of LST at all. The presence of GenJets in the N-tuple also does not affect LST regardless of whether the jet branches are enabled or disabled.