Skip to content

Enhancement/alphabet refactor#125

Merged
junniest merged 13 commits intodevelopfrom
enhancement/alphabet-refactor
Nov 22, 2025
Merged

Enhancement/alphabet refactor#125
junniest merged 13 commits intodevelopfrom
enhancement/alphabet-refactor

Conversation

@junniest
Copy link
Contributor

Refactor alphabets for clarity and fewer memory allocations (alphabets are now always represented as a reference to a static object). Added doctrings to Alphabet methods.

Refactored Alphabet creation to use module name, dna_alphabet() -> Alphabet::dna()
and protein_alphabet() -> Alphabet::protein()
Changed Alphabets to be returned as static refs since there is no reason for them
to be stored in individual models/sequences.
Removed unnecessary `Alphabet::full_set` method as it was only used in tests
Removed Alphabet::empty_freqs due to lack of meaningful interpretation, added Alphabet::len()
instead which can be used to initialise an empty frequencies vector
Renamed Alphabet::gap_encoding() to Alphabet::missing_char_encoding as this is more meaningful
given what the method does.
Additionally char_encoding and missing_char_encoding now return a new type called ConditionalProbs
rather than FreqVector as it did previously to use more meaningful names.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the alphabet system to use static references (&'static Alphabet) instead of owned Alphabet values, reducing memory allocations and providing clearer semantics. The refactor also adds comprehensive docstrings to all public Alphabet methods.

Key changes:

  • Alphabet instances are now accessed via Alphabet::dna() and Alphabet::protein() static methods that return &'static Alphabet
  • The QMatrix::alphabet() and EvoModel::alphabet() methods are now static methods (called as Q::alphabet() rather than instance.alphabet())
  • The alphabet field has been removed from all QMatrix implementations (DNA and protein models) as it's now accessed statically
  • Renamed gap_encoding() to missing_char_encoding() for better semantic clarity
  • Updated all test templates to use Q::alphabet() for retrieving the alphabet from generic type parameters
  • Added comprehensive docstrings with examples for all public Alphabet methods

Reviewed Changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated no comments.

Show a summary per file
File Description
phylo/src/alphabets/mod.rs Core refactor: added static dna() and protein() methods, removed full_set field, renamed methods, added extensive docstrings
phylo/src/alphabets/parsimony_set.rs Added Eq derive to ParsimonySet for better comparison support
phylo/src/alphabets/tests.rs Updated tests to use Alphabet::dna() and Alphabet::protein() static methods
phylo/src/alignment/mod.rs Updated documentation examples to use new static alphabet methods
phylo/src/alignment/sequences.rs Changed alphabet field type to &'static Alphabet, moved detect_alphabet to module-level function
phylo/src/alignment/tests.rs Updated tests to use static alphabet methods
phylo/src/substitution_models/mod.rs Changed QMatrix::alphabet() to static method, updated alphabet comparison in builder to use Q::alphabet(), renamed gap_encoding() to missing_char_encoding()
phylo/src/substitution_models/dna_models/mod.rs Removed alphabet field from all DNA model structs, implemented static alphabet() method
phylo/src/substitution_models/protein_models/mod.rs Removed alphabet field from protein model macro, implemented static alphabet() method
phylo/src/substitution_models/tests.rs Updated test templates to remove alphabet parameters and use Q::alphabet() for generic alphabet access
phylo/src/evolutionary_models/mod.rs Changed EvoModel::alphabet() to static method with where Self: Sized constraint
phylo/src/tkf_model/tkf91.rs Updated alphabet comparison to use Q::alphabet()
phylo/src/tkf_model/tkf92.rs Updated alphabet comparison to use Q::alphabet()
phylo/src/tkf_model/tkf_indel.rs Updated test code to use Alphabet::dna()
phylo/src/tkf_model/tests.rs Updated test functions and helpers to use &'static Alphabet and static alphabet methods
phylo/src/pip_model/mod.rs Changed alphabet() implementation to use Q::alphabet()
phylo/src/pip_model/tests.rs Updated test templates to use Q::alphabet() instead of alphabet parameters
phylo/src/phylo_info/mod.rs Replaced alphabet.empty_freqs() with FreqVector::zeros(alphabet.len())
phylo/src/phylo_info/tests.rs Updated tests to use static alphabet methods
phylo/src/phylo_info/phyloinfo_builder.rs Changed alphabet parameter type to &'static Alphabet, updated documentation and comparisons
phylo/src/likelihood/tests.rs Updated test templates to remove alphabet parameters and use Q::alphabet()
phylo/src/parsimony/mod.rs Updated to pass alphabet reference directly instead of dereferencing
phylo/src/parsimony/scoring/model_scoring.rs Added EvoModel bound to trait implementations, removed alphabet field, use P::alphabet() statically
phylo/src/parsimony/cost/dollo_parsimony_cost.rs Updated test to use Alphabet::dna()
phylo/src/optimisers/blen_optimiser_tests.rs Updated test to use Alphabet::protein()
phylo/src/io/mod.rs Updated to use Alphabet::protein() for validation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@github-actions
Copy link
Contributor

github-actions bot commented Nov 20, 2025

Test Results

587 tests  +11   586 ✅ +11   26m 43s ⏱️ -21s
  2 suites ± 0     1 💤 ± 0 
  1 files   ± 0     0 ❌ ± 0 

Results for commit 7134607. ± Comparison against base commit 792e96d.

This pull request removes 8 and adds 19 tests. Note that renamed tests count towards both.
alphabets::tests ‑ full_sets
src/alignment/mod.rs - alignment::MASA ‑ from_aligned (line 385)
src/alignment/mod.rs - alignment::MASA ‑ from_aligned_with_ancestral_unchecked (line 466)
src/alignment/mod.rs - alignment::MSA ‑ from_aligned_unchecked (line 264)
src/alignment/mod.rs - alignment::MSA ‑ len (line 201)
src/alignment/mod.rs - alignment::MSA ‑ seq_count (line 229)
src/phylo_info/phyloinfo_builder.rs - phylo_info::phyloinfo_builder::PhyloInfoBuilder<A,AA> ‑ alphabet (line 103)
src/phylo_info/phyloinfo_builder.rs - phylo_info::phyloinfo_builder::PhyloInfoBuilder<A,AA> ‑ build_w_rng (line 130)
alphabets::tests ‑ ambiguous_sets
alphabets::tests ‑ dna_alphabet_display
alphabets::tests ‑ protein_alphabet_display
src/alignment/mod.rs - alignment::MASA ‑ from_aligned (line 384)
src/alignment/mod.rs - alignment::MASA ‑ from_aligned_with_ancestral_unchecked (line 465)
src/alignment/mod.rs - alignment::MSA ‑ from_aligned_unchecked (line 263)
src/alignment/mod.rs - alignment::MSA ‑ len (line 200)
src/alignment/mod.rs - alignment::MSA ‑ seq_count (line 228)
src/alphabets/mod.rs - alphabets::Alphabet ‑ ambiguous (line 112)
src/alphabets/mod.rs - alphabets::Alphabet ‑ char_encoding (line 156)
…

♻️ This comment has been updated with latest results.

@codecov
Copy link

codecov bot commented Nov 20, 2025

Codecov Report

❌ Patch coverage is 92.68293% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.13%. Comparing base (527040d) to head (7134607).
⚠️ Report is 22 commits behind head on develop.

Files with missing lines Patch % Lines
phylo/src/alphabets/mod.rs 88.23% 4 Missing ⚠️
phylo/src/pip_model/mod.rs 33.33% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #125      +/-   ##
===========================================
+ Coverage    96.32%   97.13%   +0.80%     
===========================================
  Files           32       45      +13     
  Lines         4355     5656    +1301     
===========================================
+ Hits          4195     5494    +1299     
- Misses         160      162       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 26 out of 26 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@junniest junniest merged commit a06fb19 into develop Nov 22, 2025
16 of 17 checks passed
@junniest junniest deleted the enhancement/alphabet-refactor branch November 22, 2025 13:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants