Commit b06b752
Optimize the performance of FlashBert on HPU by using fast mode softmax (#555)
Signed-off-by: Liu, Kaixuan <[email protected]>
1 parent 8eb7a84 commit b06b752
File tree
2 files changed, +10 -6 lines changed
- backends/python/server/text_embeddings_server
  - models
  - utils
Lines changed: 9 additions & 5 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 305-307 | 305-307 | unchanged |
| | 308 | + |
| 308-310 | 309-311 | unchanged |
| ... | ... | |
| 326-328 | 327-329 | unchanged |
| 329 | | - |
| | 330 | + |
| 330 | 331 | unchanged |
| 331-334 | | - |
| | 332-338 | + |
| 335-337 | 339-341 | unchanged |

Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 78-80 | 78-80 | unchanged |
| 81 | | - |
| | 81 | + |
| 82-84 | 82-84 | unchanged |