
feat: add InternLM2/2.5 (InternLM2ForCausalLM) loader to aprender::rosetta #1589

@noahgift

Context

The cookbook architecture-demos spec tracks InternLM2.5 as status: blocked. Issue #367 ("feat: add InternLM2.5 architecture support for inference", closed 2026-04-13) was deferred ("Low — skip in QA campaign for now"). This issue re-opens that work: one family-smoke recipe and the apr-model-qa-playbook playbooks are now blocked in the cookbook until the loader lands.

Family

  • Name: internlm2_5
  • Vendor: InternLM
  • HF architectures: InternLM2ForCausalLM
  • HF pattern: internlm/internlm2_5-*
  • Reference checkpoints: internlm/internlm2_5-1_8b-chat, internlm/internlm2_5-7b-chat, internlm/internlm2_5-20b-chat
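
As a rough illustration of how a loader might recognize this family from the fields above, here is a minimal sketch. The function name `is_internlm2_5` and its signature are hypothetical and not part of aprender::rosetta; the only facts taken from this issue are the architecture string and the repo pattern.

```rust
/// Hypothetical helper (not an existing aprender API): decides whether a
/// checkpoint belongs to the internlm2_5 family described in this issue.
/// `architectures` mirrors the `architectures` array of an HF config.json.
pub fn is_internlm2_5(architectures: &[&str], repo_id: &str) -> bool {
    // HF architectures: InternLM2ForCausalLM
    let arch_ok = architectures.iter().any(|a| *a == "InternLM2ForCausalLM");
    // HF pattern: internlm/internlm2_5-* (the suffix must be non-empty)
    let repo_ok = repo_id
        .strip_prefix("internlm/internlm2_5-")
        .map_or(false, |rest| !rest.is_empty());
    arch_ok && repo_ok
}
```

Requiring both checks is a design assumption: the architecture string alone would also match plain InternLM2 checkpoints, which the title suggests the loader should handle as well, so a real implementation may want to key on the architecture only.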

Known gating constraint (from #367)

InternLM2 uses non-standard tensor naming: a single fused attention.wqkv tensor instead of separate projection weights such as self_attn.q_proj.weight. The loader needs to recognize this naming variant and either remap it (splitting the fused tensor into separate projections) or generate fused-QKV-aware kernels.
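
The remap option could look roughly like the sketch below, which splits a fused weight into separate Q, K, and V buffers. It assumes the grouped layout used by the Hugging Face reference modeling code for InternLM2, where rows are organized per KV head as the Q heads of that group, then one K head, then one V head; this should be verified against a real checkpoint. The function name `split_wqkv`, the flat row-major `f32` representation, and the error type are illustrative assumptions, not aprender APIs.

```rust
/// Sketch (hypothetical helper): split a fused, row-major `wqkv` weight of
/// shape [(num_heads + 2 * num_kv_heads) * head_dim, hidden] into Q, K, V.
/// Assumed layout per KV group: [q_per_kv Q heads | 1 K head | 1 V head].
pub fn split_wqkv(
    wqkv: &[f32],
    hidden: usize,
    num_heads: usize,
    num_kv_heads: usize,
    head_dim: usize,
) -> Result<(Vec<f32>, Vec<f32>, Vec<f32>), String> {
    if num_heads == 0 || num_kv_heads == 0 || num_heads % num_kv_heads != 0 {
        return Err("num_heads must be a positive multiple of num_kv_heads".into());
    }
    let q_per_kv = num_heads / num_kv_heads;
    let group = q_per_kv + 2; // head slots per KV group: Q heads, then K, then V
    let block = head_dim * hidden; // elements covered by one head's rows
    let expected = num_kv_heads * group * block;
    if wqkv.len() != expected {
        return Err(format!("wqkv has {} elements, expected {}", wqkv.len(), expected));
    }

    let mut q = Vec::with_capacity(num_heads * block);
    let mut k = Vec::with_capacity(num_kv_heads * block);
    let mut v = Vec::with_capacity(num_kv_heads * block);
    for g in 0..num_kv_heads {
        let base = g * group * block;
        let k_start = base + q_per_kv * block;
        let v_start = k_start + block;
        q.extend_from_slice(&wqkv[base..k_start]); // all Q heads of this group
        k.extend_from_slice(&wqkv[k_start..v_start]); // the group's K head
        v.extend_from_slice(&wqkv[v_start..v_start + block]); // the group's V head
    }
    Ok((q, k, v))
}
```

The resulting buffers could then be registered under the separate-projection names the rest of the loader expects. Splitting at load time costs one copy of the attention weights but keeps existing kernels unchanged; the fused-kernel alternative avoids the copy at the price of a new code path.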

Acceptance criteria

Unblock impact

  • Cookbook manifest flips from blocked to certified
  • 1.8b checkpoint is small enough for sub-1B dim-smoke regression coverage

Cookbook reference
