Conversation

@corneliusroemer
Member

  • feat(phylo/clade-i): Lift OPG gene names from all-clades build to clade I reference
  • Switch off timetree for clade I as it doesn't work given the range of rates
  • Use accession not accessionVersion for stability
  • Remove versions from known duplicates
  • parallelized deduplicate
  • weights
  • Better masking of clade I issues in all-clades tree
  • More fine grained sampling and exclusions
  • Few more changes
  • Fix ambiguous date format for augur filter
  • Add more known duplicates
  • Add more known duplicates
  • Use masked alignment for refine (branch lengths)
  • Fix build url, thanks @chaoran-chen for the spot

Description of proposed changes

Related issue(s)

Checklist

  • Checks pass
  • Update changelog

Comment on lines +73 to +74
sequences_url="https://lapis.pathoplexus.org/mpox/sample/unalignedNucleotideSequences?downloadAsFile=true&downloadFileBasename=mpox_nuc_2025-03-19T1422&versionStatus=LATEST_VERSION&isRevocation=false&dataFormat=fasta&compression=zstd",
metadata_url="https://lapis.pathoplexus.org/mpox/sample/details?downloadAsFile=true&downloadFileBasename=mpox_metadata_2025-03-19T1422&versionStatus=LATEST_VERSION&isRevocation=false&dataFormat=tsv&compression=zstd",
Member

[question, not review]

Thinking about how we would use ppx in mpox for canonical ingest, and assuming that all NCBI data is in ppx (?), we would be dropping the fetch_from_ncbi.smk code and fetching TSV & FASTA from an API call similar to these lines. We'd then (?) convert these to a data/ppx.ndjson structure and curate the data as normal. Is this about right? Is this something we should be doing?
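To make the "convert these to a data/ppx.ndjson structure" step concrete, here is a minimal sketch of joining a metadata TSV with a FASTA file into NDJSON records. It is only an illustration under stated assumptions: the helper names (`read_fasta`, `to_ndjson`), the `sequence` field name, and the choice of `accessionVersion` as the column matching the FASTA headers are all hypothetical, and the inputs are assumed to be already decompressed text.

```python
import csv
import io
import json


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def to_ndjson(metadata_handle, fasta_handle, out_handle, id_column="accessionVersion"):
    """Write one JSON record per metadata row that has a matching sequence.

    id_column is an assumption: the metadata column whose value equals the
    FASTA header. Rows without a matching sequence are skipped.
    Returns the number of records written.
    """
    # Holds all sequences in memory; fine for a sketch, not tuned for scale.
    sequences = dict(read_fasta(fasta_handle))
    reader = csv.DictReader(metadata_handle, delimiter="\t")
    written = 0
    for row in reader:
        sequence = sequences.get(row[id_column])
        if sequence is None:
            continue
        record = dict(row)
        record["sequence"] = sequence  # hypothetical field name
        out_handle.write(json.dumps(record) + "\n")
        written += 1
    return written
```

In a workflow this would be called with open file handles for the downloaded TSV and FASTA and an output handle for the NDJSON file; the existing curation steps would then run on that file as they do today.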

Contributor

[also question, but also kinda review]

How often is this param expected to change: downloadFileBasename=mpox_metadata_2025-03-19T1422? And is its timestamp always expected to be the same between sequences_url and metadata_url? I would be inclined to pull that out into a single param and then interpolate it into both URLs, unless it is exceptionally stable…
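A minimal sketch of what that could look like, assuming the only part that changes is the timestamp suffix of downloadFileBasename. The names `lapis_url`, `LAPIS_BASE`, and `download_timestamp` are hypothetical and not part of the PR; the query parameters are copied from the two URLs under review.

```python
from urllib.parse import urlencode

# Base of the two endpoints used in the PR's URLs.
LAPIS_BASE = "https://lapis.pathoplexus.org/mpox/sample"


def lapis_url(endpoint, basename_prefix, data_format, timestamp):
    """Build a download URL; the timestamp is shared by all callers."""
    query = {
        "downloadAsFile": "true",
        "downloadFileBasename": f"{basename_prefix}_{timestamp}",
        "versionStatus": "LATEST_VERSION",
        "isRevocation": "false",
        "dataFormat": data_format,
        "compression": "zstd",
    }
    return f"{LAPIS_BASE}/{endpoint}?{urlencode(query)}"


# Single place to update (hypothetical name); could equally live in the config.
download_timestamp = "2025-03-19T1422"

sequences_url = lapis_url(
    "unalignedNucleotideSequences", "mpox_nuc", "fasta", download_timestamp
)
metadata_url = lapis_url("details", "mpox_metadata", "tsv", download_timestamp)
```

With the timestamp above, both calls reproduce the URLs currently hard-coded in the diff, so the two basenames cannot drift apart.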

Member

I think https://github.com/nextstrain/rsv/pull/87/files has answered my question here

rule join_metadata:
    input:
        metadata="data/metadata.tsv",
        stats=rules.filter_nextclade_results.output.stats,
Contributor

        metadata="results/decent_metadata_raw.tsv",
    output:
        metadata="results/decent_metadata.tsv",
    run:
Contributor

# --output {output.tree}
# """
"""
~/code/pree/rust/target/release/rust_parsimony build \
Contributor

just noting this as a merge blocker.

- dnachun
dependencies:
- augur
- augur <= 27 # Augur 27-29.0 have issues with ancestral reconstruction
Contributor

are there specific bug reports to link to here?

4 participants