@zyaoj zyaoj commented Sep 20, 2025

What does this PR do? Please describe:

  • A simple model family implementation that wraps models/processors/tokenizers loaded via the transformers Auto classes
  • Currently supports 4 types of Auto classes (auto, causal LM, seq2seq LM, custom)
  • Added basic asset examples for Qwen3 and Qwen2.5 Omni (+ docs and tests)
  • Exposed some additional convenience APIs (feel free to remove them)
  • Wrapped (yet another) layer around hg tokenizers (maybe not ideal, but the goal is to keep the interfaces similar to the original Hugging Face ones to ease migration)
  • [nit] Added .coveragerc and some gitignore entries for dev and test

Loading Qwen2.5 Omni should be as simple as:

from fairseq2.models.hg import get_hg_model_hub, get_hg_tokenizer_hub

name = "hg_qwen25_omni_3b"

# Load a pre-configured model
model_hub = get_hg_model_hub()
model = model_hub.load_model(name)

# Load the corresponding tokenizer
tokenizer_hub = get_hg_tokenizer_hub()
tokenizer = tokenizer_hub.load_tokenizer(name)

Here's a paste if you want to check the full log.

Fixes #1301

TBD:

  • A wrapper class to handle the `Any` return type of AutoXXX.from_pretrained(...)
  • Device placement and customized sharder
  • Integration w/ post-training recipes
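One possible shape for the first TBD item above: a thin generic shim that narrows the `Any` result of `from_pretrained` to a concrete model type. This is only a sketch of the idea; `TypedAutoLoader` is a hypothetical name, not the planned fairseq2 design:

```python
# Hypothetical typed wrapper around an Auto class so that callers get a
# concrete model type instead of ``Any``. All names here are illustrative.

from typing import Any, Generic, TypeVar, cast

M = TypeVar("M")


class TypedAutoLoader(Generic[M]):
    def __init__(self, auto_cls: Any, model_cls: type[M]) -> None:
        self._auto_cls = auto_cls
        self._model_cls = model_cls

    def from_pretrained(self, name: str, **kwargs: Any) -> M:
        # Delegate to the wrapped Auto class, then verify the runtime type
        # so the cast below is actually safe.
        model = self._auto_cls.from_pretrained(name, **kwargs)
        if not isinstance(model, self._model_cls):
            raise TypeError(
                f"expected {self._model_cls.__name__}, "
                f"got {type(model).__name__}"
            )
        return cast(M, model)
```

With something like this, `loader.from_pretrained(name)` type-checks as the concrete model class rather than `Any`.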

Does your PR introduce any breaking changes? If yes, please list them:

  • Added tests for hg models, so the `[hg]` extra must be installed (e.g. `pip install -e '.[hg]'`) for the full test suite to pass
  • For hg tokenizers, added logic to retrieve the default special token ids when not overridden by user input

Check list:

  • Was the content of this PR discussed and approved via a GitHub issue? (no need for typos or documentation improvements)
  • Did you read the contributor guideline?
  • Did you make sure that your PR does only one thing instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • Did you update the CHANGELOG? (no need for typos, documentation, or minor internal changes)
