Hello @KasiaSmietanka,
As previously discussed, I updated the pairing rules so that participants with conditions (HF, MI, Stroke, etc.) whose onset dates can't be calculated are also included as FHIR resources; these conditions are included with an 'undefined' onset date.
I re-ran the pipeline; the new database is available at
/groups/umcg-lifelines/tmp01/projects/ov22_0581/pheno_lifelines_sqlite/db-lifelines-sep23.db
I also re-ran the queries that generate the CSV files discriminating incident/prevalent Stroke, MI, and HF. I wrote the output to a new folder so that the original one is kept for comparison purposes:
/groups/umcg-lifelines/tmp01/projects/ov22_0581/pheno_lifelines_sqlite/queries_output_sep23
This time I also extracted the participants whose conditions are reported as active but can't be identified as prevalent or incident when working with the harmonized FHIR data, because the corresponding onset time can't be inferred (for example, because the date of the assessment with the '1' in stroke_followup_adu_q_1 is not available). These are in the 'xxxx_undefined_osdate.csv' files. Only 35 participants were skipped in total this time, but there are now more conditions with an 'undefined' onset date.
Now I see that the differences between the results of the queries on the FHIR data and the scripts that use the original Lifelines raw data are due to limitations of the intermediate FHIR representation when values are missing. For instance, when processing the raw data directly, you can tell that a condition is incident just by checking that there is a '1' in xxxxx_followup_adu_q_1, even if other details are missing. When making the same calculation on the harmonized FHIR data, on the other hand, the only way to determine whether a condition is incident or prevalent is to compare the 'onset' date of the condition with a reference date (in this case, the baseline assessment). Hence, when the onset date of a condition can't be inferred during the harmonization process, its prevalent/incident status can no longer be determined.
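To make the difference concrete, here is a minimal Python sketch of the two ways of classifying a condition. It is only an illustration, not the actual pipeline or query code: the function names are hypothetical, and I am assuming that an onset on or before the baseline assessment date counts as prevalent and a later one as incident.

```python
from datetime import date
from typing import Optional


def classify_from_raw(followup_flag: Optional[str]) -> str:
    """Raw-data view (hypothetical helper): a '1' in a follow-up variable
    such as stroke_followup_adu_q_1 is enough to call the condition
    incident, even when the assessment date is missing."""
    if followup_flag == "1":
        return "incident"
    return "not reported"


def classify_from_fhir(onset: Optional[date], baseline: date) -> str:
    """Harmonized-FHIR view (hypothetical helper): the only available
    signal is the onset date compared with the baseline assessment date.
    Assumption: onset <= baseline means prevalent, otherwise incident."""
    if onset is None:
        # Onset could not be inferred during harmonization.
        return "undefined"
    return "prevalent" if onset <= baseline else "incident"


baseline = date(2010, 1, 1)  # made-up baseline date for illustration

# Same participant, follow-up flag present but assessment date missing:
print(classify_from_raw("1"))              # incident
print(classify_from_fhir(None, baseline))  # undefined
```

The last two calls show the case behind the 'xxxx_undefined_osdate.csv' files: the raw flag alone identifies the condition as incident, while the harmonized record has no onset date left to compare with the baseline.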
Please let me know your thoughts. This may eventually require further discussion with @baukearends, depending on which elements we want to include in the analysis/prediction model training. For example, would there be alternative ways to estimate these onset dates when the assessment date is unavailable?