[Roadmap] Kimi-K2 performance enhancement on H20 GPU #8151

@zhangxiaolei123456

[Proposal] Kimi-K2 performance enhancement on H20 GPU

Summary

Our current tests show that Kimi K2 performs very poorly under TP16. In the 3500-token input / 1500-token output scenario, with an SLO of TTFT < 5s and TPOT < 50ms, total throughput per card reaches only 36 tokens/s. This plan therefore aims to quickly improve Kimi K2 performance on H20 hardware, fix the bugs found along the way, and document best practices.
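To make the target concrete, here is a minimal sketch of how the SLO check and the per-card throughput figure could be computed from per-request timings. The `RequestTiming` class and both helper functions are hypothetical names for illustration, not part of any benchmark tool. Whether "total throughput" counts input plus output tokens is also an assumption, so it is left as a flag.

```python
from dataclasses import dataclass

# Thresholds taken from the summary above.
TTFT_SLO_S = 5.0    # time to first token must be below 5 s
TPOT_SLO_S = 0.050  # time per output token must be below 50 ms


@dataclass
class RequestTiming:
    """Hypothetical per-request measurement record."""
    ttft_s: float       # time to first token, in seconds
    e2e_s: float        # end-to-end latency, in seconds
    input_tokens: int
    output_tokens: int

    @property
    def tpot_s(self) -> float:
        # Mean time per output token, excluding the first token.
        if self.output_tokens <= 1:
            return 0.0
        return (self.e2e_s - self.ttft_s) / (self.output_tokens - 1)


def meets_slo(r: RequestTiming) -> bool:
    """True if the request satisfies both the TTFT and TPOT limits."""
    return r.ttft_s < TTFT_SLO_S and r.tpot_s < TPOT_SLO_S


def per_gpu_throughput(requests, wall_time_s, num_gpus, count_input=True):
    """Tokens per second per GPU over the whole benchmark run.

    count_input=True assumes 'total throughput' means input + output tokens.
    """
    tokens = sum(
        r.output_tokens + (r.input_tokens if count_input else 0)
        for r in requests
    )
    return tokens / wall_time_s / num_gpus


# Example with the 3500/1500 shape from the summary (timings are made up).
example = RequestTiming(ttft_s=4.0, e2e_s=4.0 + 1499 * 0.04,
                        input_tokens=3500, output_tokens=1500)
print(meets_slo(example))
print(per_gpu_throughput([example] * 16, wall_time_s=100.0, num_gpus=16))
```

The timings in the example are illustrative only and are not measurements from this issue. In a real run, the request list would come from the benchmark client's per-request log, and only runs where every request (or an agreed percentile) passes `meets_slo` would count toward the throughput figure.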

Roadmap
