How to select params for self extend? (llama3.1 and gemma2)

I need to extend the context length of llama3.1-8b, say from 8k up to 128k, and do the same for gemma2.
I see that there is an example for gemma2 [here](https://github.com/datamllab/LongLM/pull/25).
However, I don't know how to select these parameters. Is there a formula to follow?

Gemma2:
```
from functools import partial

# Keep a reference to the unpatched forward
original_gemma_forward = GemmaAttention.forward
self_extend_forward = partial(
    SE.Gemma.flash_self_extend_forward, group_size_1=8, group_size_2=1024
)
```

And for llama3.1:
```
window_size = 1024
group_size = 32
use_flash = True
# flash_attention_impl can be "triton" or "flash_attn"
SelfExtend.apply(
    model, group_size, window_size,
    enable_flash_attention=use_flash, flash_attention_impl="flash_attn",
)
```
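One relation that may help when picking these values: the Self-Extend paper gives the maximum extended context length as `(L - w_n) * G + w_n`, where `L` is the pretraining context length, `w_n` the neighbor window and `G` the group size. The sketch below only evaluates that bound. The helper names are hypothetical and not part of the LongLM repo, and it assumes `window_size` (or `group_size_2` in the gemma2 example) plays the role of `w_n` and `group_size` (or `group_size_1`) the role of `G`.

```python
import math


def max_extended_length(pretrain_len: int, group_size: int, window_size: int) -> int:
    """Upper bound from the Self-Extend paper: (L - w_n) * G + w_n."""
    return (pretrain_len - window_size) * group_size + window_size


def min_group_size(pretrain_len: int, target_len: int, window_size: int) -> int:
    """Smallest integer G whose bound reaches target_len (hypothetical helper)."""
    if window_size >= pretrain_len:
        raise ValueError("window_size must be smaller than pretrain_len")
    needed = (target_len - window_size) / (pretrain_len - window_size)
    return max(1, math.ceil(needed))


# Settings from the llama3.1 snippet, assuming an 8k pretraining length
print(max_extended_length(8192, 32, 1024))  # 230400
# Smallest group size that covers 128k with a 1024-token neighbor window
print(min_group_size(8192, 131072, 1024))   # 19
# Settings from the gemma2 snippet (G=8, w_n=1024)
print(max_extended_length(8192, 8, 1024))   # 58368
```

The bound says only how far the positions can be stretched. It does not tell which combination gives the best quality, so candidate values probably still need to be checked empirically.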

Btw, what do the `Passkey` entries in these logs mean?
```
#Tokens of Prompt: 9992 Passkey target: 58328
```
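These lines look like output from a passkey retrieval test, where a random number is hidden inside long filler text and the model is asked to repeat it. If that reading is right, the log reports the prompt's token count and the number the model is expected to return. Below is a minimal sketch under that assumption. The function name, filler text and prompt wording are hypothetical and not taken from the repo's script.

```python
import random


def build_passkey_prompt(n_filler: int, seed: int = 0) -> tuple[str, int]:
    """Hide a random 5-digit passkey at a random position inside filler text."""
    rng = random.Random(seed)
    passkey = rng.randint(10000, 99999)
    filler = "The grass is green. The sky is blue. The sun is yellow. "
    parts = [filler] * n_filler
    # Insert the sentence carrying the passkey somewhere in the filler
    parts.insert(rng.randint(0, n_filler), f"The pass key is {passkey}. Remember it. ")
    prompt = (
        "There is important info hidden inside a lot of irrelevant text. Find it.\n"
        + "".join(parts)
        + "\nWhat is the pass key? The pass key is"
    )
    return prompt, passkey


prompt, target = build_passkey_prompt(n_filler=400)
# A whitespace word count is only a rough stand-in for a real tokenizer count
print(f"#Words of Prompt: {len(prompt.split())} Passkey target: {target}")
```

The test passes when the model's continuation contains the target number, which makes it a quick check that the extended context is actually usable at a given prompt length.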


How to select params for self extend? (llama3.1 and gemma2) #47
