
When running inference with vLLM on limited GPU memory, how should max_model_len be set? #57

@hoyoung2015

Description

As the title says: if `max_model_len` is set too large it consumes too much GPU memory and the batch size can't be increased, but if it is set too small I worry it will hurt output quality.
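One way to reason about this trade-off is to estimate how much KV-cache memory each cached token costs, then see how many total tokens (`max_model_len` × batch size) fit in whatever VRAM remains after the model weights. The sketch below uses assumed numbers for a generic 7B-class model in fp16 (32 layers, 32 KV heads, head dim 128); substitute your model's actual config. The helper names are illustrative, not part of vLLM's API — in vLLM itself the relevant knobs are `max_model_len` and `gpu_memory_utilization` on the `LLM` constructor.

```python
# Back-of-the-envelope KV-cache sizing to pick max_model_len under a VRAM budget.
# All model numbers below are illustrative assumptions (a 7B-class fp16 model),
# not values from the issue.

def kv_cache_bytes_per_token(num_layers: int, num_kv_heads: int,
                             head_dim: int, dtype_bytes: int) -> int:
    """Bytes of KV cache one token occupies: a K and a V vector per layer."""
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes

def max_cached_tokens(vram_budget_bytes: int, per_token_bytes: int) -> int:
    """Total tokens (sequence length x batch size) that fit in the budget."""
    return vram_budget_bytes // per_token_bytes

# Assumed model: 32 layers, 32 KV heads, head_dim 128, fp16 (2 bytes/value).
per_token = kv_cache_bytes_per_token(32, 32, 128, 2)   # 524288 B = 0.5 MiB/token

# Suppose ~8 GiB of VRAM is left for the KV cache after weights and activations.
budget = 8 * 1024**3
total_tokens = max_cached_tokens(budget, per_token)    # 16384 tokens

# The trade-off from the question: max_model_len * batch_size <= total_tokens,
# so halving max_model_len roughly doubles the achievable batch size.
for max_model_len in (2048, 4096, 8192):
    print(f"max_model_len={max_model_len} -> batch up to "
          f"{total_tokens // max_model_len}")
```

In practice, a reasonable approach is to set `max_model_len` to the longest prompt + generation length your workload actually needs (measure your real inputs) rather than the model's maximum, since vLLM pre-reserves KV-cache blocks based on it; anything beyond your true sequence lengths only eats batch capacity without improving quality.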
