feat: add Falcon (classic) (FalconForCausalLM) loader to aprender::rosetta #1587

@noahgift

Context

The cookbook architecture-demos spec tracks Falcon (classic: 7B/40B/180B) as status: blocked. Note that this is distinct from falcon_h1, which is already supported via contracts/model-families/falcon_h1.yaml. Falcon-H1 is a hybrid SSM+transformer architecture; Falcon-classic is a pure decoder transformer with multi-query attention.

Family

  • Name: falcon (classic)
  • Vendor: TII
  • HF architectures: FalconForCausalLM
  • HF pattern: tiiuae/falcon-* (NOT tiiuae/Falcon-H1-*)
  • Reference checkpoints: tiiuae/falcon-7b, tiiuae/falcon-40b, tiiuae/falcon-rw-1b

Acceptance criteria

  • contracts/model-families/falcon.yaml exists (separate from existing falcon_h1.yaml)
  • Loader handles multi-query attention (typically n_kv_heads = 1, as in falcon-7b)
  • Discriminator distinguishes classic Falcon from Falcon-H1 (no mamba_d_state / mamba_expand fields)
  • At least one inference smoke pass against tiiuae/falcon-rw-1b (smallest variant)
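
As a rough sketch of the multi-query criterion above, the loader could resolve the number of key/value heads before allocating attention tensors. The struct and function names here are hypothetical and are not aprender's actual types. The field names (`multi_query`, `new_decoder_architecture`, `num_kv_heads`, `num_attention_heads`) are assumed to follow the Hugging Face `FalconConfig` attributes; older checkpoint configs may spell them differently, so treat this as an illustration rather than the loader's implementation.

```rust
/// Hypothetical subset of a Falcon `config.json`.
/// Field names are assumed to mirror Hugging Face `FalconConfig`.
#[derive(Debug, Clone, Copy)]
struct FalconAttnConfig {
    num_attention_heads: usize,
    num_kv_heads: Option<usize>,
    multi_query: bool,
    new_decoder_architecture: bool,
}

/// Number of key/value heads to allocate (hypothetical helper).
/// - new decoder architecture: explicit `num_kv_heads`, else one per query head
/// - multi-query attention: a single shared key/value head
/// - otherwise: plain multi-head attention, one KV head per query head
fn effective_kv_heads(cfg: &FalconAttnConfig) -> usize {
    if cfg.new_decoder_architecture {
        cfg.num_kv_heads.unwrap_or(cfg.num_attention_heads)
    } else if cfg.multi_query {
        1
    } else {
        cfg.num_attention_heads
    }
}

fn main() {
    // Illustrative values only; not read from a real checkpoint.
    let mqa = FalconAttnConfig {
        num_attention_heads: 71,
        num_kv_heads: None,
        multi_query: true,
        new_decoder_architecture: false,
    };
    let grouped = FalconAttnConfig {
        num_attention_heads: 128,
        num_kv_heads: Some(8),
        multi_query: true,
        new_decoder_architecture: true,
    };
    println!("mqa kv heads: {}", effective_kv_heads(&mqa));
    println!("grouped kv heads: {}", effective_kv_heads(&grouped));
}
```

Keeping the resolution in one function means the smoke test can assert on the KV head count separately from tensor loading.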

Unblock impact

  • Cookbook manifest flips from blocked to certified
  • Avoids confusion in the cookbook detector: the falcon_h1 discriminator currently works only because classic Falcon configs lack the SSM markers; adding a positive discriminator for classic Falcon hardens this
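
One possible shape for a positive discriminator is sketched below. It requires both a match on the `FalconForCausalLM` architecture string and the absence of the SSM markers named in the acceptance criteria. The enum, the function, and the idea of passing the config as a list of top-level keys are all hypothetical; the real detector may read configs differently.

```rust
/// Outcome of the hypothetical classic-Falcon check.
#[derive(Debug, PartialEq, Eq)]
enum Detection {
    /// Architecture matches and no SSM markers are present.
    ClassicFalcon,
    /// Architecture matches but SSM markers are present; needs a human look.
    Conflict,
    /// Architecture string does not match classic Falcon.
    NotClassicFalcon,
}

/// Config keys treated as Falcon-H1 (SSM) markers, taken from the issue text.
const SSM_MARKERS: [&str; 2] = ["mamba_d_state", "mamba_expand"];

/// Positive discriminator: match on the architecture name first,
/// then use the SSM markers only as a consistency check.
fn detect_classic_falcon(architectures: &[&str], config_keys: &[&str]) -> Detection {
    let arch_match = architectures.iter().any(|a| *a == "FalconForCausalLM");
    let has_ssm = config_keys
        .iter()
        .any(|k| SSM_MARKERS.iter().any(|m| *m == *k));
    match (arch_match, has_ssm) {
        (true, false) => Detection::ClassicFalcon,
        (true, true) => Detection::Conflict,
        (false, _) => Detection::NotClassicFalcon,
    }
}

fn main() {
    // Illustrative inputs; the key lists are not copied from real configs.
    let classic = detect_classic_falcon(&["FalconForCausalLM"], &["hidden_size", "multi_query"]);
    let hybrid = detect_classic_falcon(&["SomeOtherArch"], &["hidden_size", "mamba_d_state"]);
    println!("{:?} {:?}", classic, hybrid);
}
```

With this shape, a config is never labelled classic Falcon just because the SSM fields are missing, which is the failure mode the bullet above describes.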

Cookbook reference
