Megatron cp experiments #2

Open
hanwen-sun wants to merge 5 commits into v0.11.0 from megatron_experiments

hanwen-sun (Collaborator) commented May 26, 2025

What this PR does:

  • Provides an example in ./magiattention_example that trains a llama-1b model from scratch using Megatron TE context parallelism (context_parallel).
  • Compares the resulting training loss with that of magi_attention.
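
One way to make the loss comparison concrete is to check how far the two per-step loss curves drift apart. The sketch below is only an illustration under assumptions: the helper name `compare_losses`, the `rel_tol` threshold, and the idea that each run's losses are already available as a plain list of floats are all hypothetical and not taken from this PR.

```python
def compare_losses(baseline, candidate, rel_tol=1e-2):
    """Compare two per-step loss curves of equal length.

    Hypothetical helper, not part of this PR. `baseline` and `candidate`
    are lists of floats (for example, one run per attention backend).
    Returns (max_abs_diff, max_rel_diff, within_tol).
    """
    if len(baseline) != len(candidate):
        raise ValueError("loss curves must cover the same number of steps")

    max_abs = 0.0
    max_rel = 0.0
    for b, c in zip(baseline, candidate):
        abs_diff = abs(b - c)
        # Guard against division by zero for a (near-)zero baseline loss.
        rel_diff = abs_diff / max(abs(b), 1e-12)
        max_abs = max(max_abs, abs_diff)
        max_rel = max(max_rel, rel_diff)

    # within_tol is True when no step deviates by more than rel_tol (relative).
    return max_abs, max_rel, max_rel <= rel_tol
```

The tolerance is a judgment call: runs that differ only in attention backend are not expected to match bit for bit, so a relative threshold is usually more informative than exact equality.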
