feat(rosetta): Architecture::Bloom variant + BLOOM model-family contract (closes #1586) #1694
Merged
Conversation
BLOOM (BigScience) is a GPT-2-derivative architecture with:

- ALiBi linear position bias (no positional-embedding tensor)
- Fused QKV (`self_attention.query_key_value`, Q/K/V interleaved per head)
- GELU MLP, LayerNorm, biases everywhere, tied embeddings
- HuggingFace `h.N.*` naming (NOT `model.layers.N.*`)

The HF tensor names diverge from every existing mapper, so a new `Architecture::Bloom` variant is added rather than reusing the `Llama` or `Gpt2` mappers.

Engine changes (single function each):

- `converter_types.rs::Architecture` + `Bloom` variant
- `tensor_expectation.rs::map_name` + `Bloom` → `bloom_map_name`
- `tensor_expectation.rs::is_llm` + `Bloom`
- `tensor_expectation.rs::display_name` + `"BLOOM"`
- `tensor_expectation.rs::from_model_type` + `"bloom" | "bloomz"` → `Bloom`
- `tensor_expectation.rs::bloom_map_name` — NEW function (50 LOC)

`bloom_map_name` translates:

| HF tensor name | Mapped name |
|---|---|
| `word_embeddings.weight` | `model.embed_tokens.weight` |
| `word_embeddings_layernorm.{w,b}` | `model.embed_norm.{w,b}` |
| `h.N.input_layernorm.{w,b}` | `model.layers.N.input_layernorm.{w,b}` |
| `h.N.self_attention.query_key_value` | `model.layers.N.self_attn.qkv_proj` |
| `h.N.self_attention.dense` | `model.layers.N.self_attn.o_proj` |
| `h.N.post_attention_layernorm.{w,b}` | `model.layers.N.post_attention_layernorm.{w,b}` |
| `h.N.mlp.dense_h_to_4h` | `model.layers.N.mlp.up_proj` |
| `h.N.mlp.dense_4h_to_h` | `model.layers.N.mlp.down_proj` |
| `ln_f.{w,b}` | `model.norm.{w,b}` |

Fused QKV is kept fused at this layer; splitting into separate Q/K/V tensors must happen at the conversion layer (BLOOM interleaves Q/K/V per head, not concatenated like GPT-NeoX).

The YAML contract at `contracts/model-families/bloom.yaml` covers the BLOOM-560M and BLOOM-7B1 size variants (shared 250880-token vocab).
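The mapping above is essentially prefix rewriting. A minimal sketch of that translation, not the actual `bloom_map_name` in `tensor_expectation.rs` (whose signature and helpers may differ):

```rust
// Illustrative sketch of the HF-BLOOM -> canonical name mapping.
fn bloom_map_name(name: &str) -> String {
    // Non-layer tensors; the longer prefix is listed first so it wins.
    let top_level = [
        ("word_embeddings_layernorm", "model.embed_norm"),
        ("word_embeddings", "model.embed_tokens"),
        ("ln_f", "model.norm"),
    ];
    for (hf, ours) in top_level {
        if let Some(tail) = name.strip_prefix(hf) {
            return format!("{ours}{tail}"); // tail is ".weight" or ".bias"
        }
    }
    // Per-layer tensors: "h.N.<inner>" -> "model.layers.N.<mapped inner>".
    if let Some(rest) = name.strip_prefix("h.") {
        if let Some((layer, inner)) = rest.split_once('.') {
            let inner_map = [
                ("self_attention.query_key_value", "self_attn.qkv_proj"),
                ("self_attention.dense", "self_attn.o_proj"),
                ("input_layernorm", "input_layernorm"),
                ("post_attention_layernorm", "post_attention_layernorm"),
                ("mlp.dense_h_to_4h", "mlp.up_proj"),
                ("mlp.dense_4h_to_h", "mlp.down_proj"),
            ];
            for (hf, ours) in inner_map {
                if let Some(tail) = inner.strip_prefix(hf) {
                    return format!("model.layers.{layer}.{ours}{tail}");
                }
            }
        }
    }
    name.to_string() // unknown names pass through unchanged
}

fn main() {
    assert_eq!(
        bloom_map_name("h.0.self_attention.query_key_value.weight"),
        "model.layers.0.self_attn.qkv_proj.weight"
    );
    println!("{}", bloom_map_name("word_embeddings_layernorm.bias"));
    // -> model.embed_norm.bias
}
```

Ordering the `word_embeddings_layernorm` entry before `word_embeddings` matters: a naive prefix match would otherwise rewrite the layernorm tensors to `model.embed_tokens_layernorm.*`.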
Out of scope (separate tickets):

- ALiBi runtime inference support — `is_inference_verified()` returns false for BLOOM; the engine has no ALiBi position-bias code path
- Fused QKV splitter at the conversion layer

Verified:

- `pv validate contracts/model-families/bloom.yaml` → 0 errors
- All 3 falsifiers pass: FALSIFY-PARITY-002 (every YAML mapped), FALSIFY-MF-006 (no duplicate arch classes), FALSIFY-MF-011 (vocab consistency)
- All 13764 `aprender-core --lib` tests pass
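To make the per-head interleaving concrete: a hypothetical sketch of what the out-of-scope conversion-layer splitter would have to do. Rows are modelled as `Vec<f32>` purely for illustration; this is not the planned implementation.

```rust
// BLOOM's fused weight stores rows per head as [q_h | k_h | v_h], so the
// split must gather head-sized slices — taking three contiguous thirds
// (as a GPT-NeoX-style concatenated layout would allow) is wrong here.
fn split_interleaved_qkv(
    fused: &[Vec<f32>], // n_heads * 3 * head_dim rows of the fused projection
    n_heads: usize,
    head_dim: usize,
) -> (Vec<Vec<f32>>, Vec<Vec<f32>>, Vec<Vec<f32>>) {
    assert_eq!(fused.len(), n_heads * 3 * head_dim);
    let (mut q, mut k, mut v) = (Vec::new(), Vec::new(), Vec::new());
    for h in 0..n_heads {
        let base = h * 3 * head_dim; // start of this head's [q | k | v] block
        q.extend_from_slice(&fused[base..base + head_dim]);
        k.extend_from_slice(&fused[base + head_dim..base + 2 * head_dim]);
        v.extend_from_slice(&fused[base + 2 * head_dim..base + 3 * head_dim]);
    }
    (q, k, v)
}

fn main() {
    // 2 heads, head_dim 1: rows tagged 0..6 make the gather pattern visible.
    let fused: Vec<Vec<f32>> = (0..6).map(|i| vec![i as f32]).collect();
    let (q, k, v) = split_interleaved_qkv(&fused, 2, 1);
    assert_eq!(q, vec![vec![0.0], vec![3.0]]); // q rows of head 0 and head 1
    assert_eq!(k, vec![vec![1.0], vec![4.0]]);
    assert_eq!(v, vec![vec![2.0], vec![5.0]]);
}
```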
Summary
Re-authors PR #1685 (which itself superseded #1671) as a clean cherry-pick
onto current `main`; the previous branch went dirty again after #1686
(InternLM2) landed and introduced interleaving conflicts in
`tensor_expectation.rs` / `converter_types.rs`.
The branch now coexists cleanly with both FalconClassic (#1673) and
InternLM2 (#1686) in the enum and dispatch arms.
Closes #1586. Supersedes #1671 / #1685.
🤖 Generated with Claude Code