Replies: 3 comments 1 reply
-
Same issue.
-
I got it working. Here is the implementation:
-
Is there any way to run Unsloth's DeepSeek 0528 IQ1 or IQ2 quants? I tried DeepSeek-R1-0528-IQ2_XXS, and it only consumes 100+ GB of my memory. It only runs with CUDA graph disabled, and it outputs complete garbage (a jumble of words) after I say "hello" to it. If I turn on CUDA graph, KT raises an error. I have no idea how to fix this.
-
I've downloaded the Q4_K_M quant, and it's not working with the standard instructions. Any tips on which optimize rules to use?
I have 512 GB of RAM and 2x 4090 GPUs. The CPU is an Intel processor with AMX, but I can work with AVX-512.