2 files changed: +18 -1 lines changed
### Fixes and improvements

+ ## [v4.4.0](https://github.com/OpenNMT/CTranslate2/releases/tag/v4.4.0) (2024-09-09)
+ **Removed**: Flash Attention support in the Python package due to significant package size increase with minimal performance gain.
+ Note: Flash Attention remains supported in the C++ package with the `WITH_FLASH_ATTN` option.
+ Flash Attention may be re-added in the future if substantial improvements are made.
+
+ ### New features
+ * Support Llama3 (#1751)
+ * Support Gemma2 (#1772)
+ * Add log probs for all tokens in vocab (#1755)
+ * Grouped conv1d (#1749 + #1758)
+
+ ### Fixes and improvements
+ * Fix pipeline (#1723 + #1747)
+ * Some improvements in flash attention (#1732)
+ * Fix crash when using return_alternative on CUDA (#1733)
+ * Quantization AWQ GEMM + GEMV (#1727)
+
## [v4.3.1](https://github.com/OpenNMT/CTranslate2/releases/tag/v4.3.1) (2024-06-10)
Note: Release v4.3.0 could not be published because the project exceeded its size limit on PyPI (> 20 GB).
"""Version information."""

- __version__ = "4.3.1"
+ __version__ = "4.4.0"
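
A minimal sketch, not part of this change: because v4.4.0 removes Flash Attention from the Python package, calling code may want to branch on the installed version. The helper names `parse_version` and `python_package_drops_flash_attention` are hypothetical, and the parser assumes a plain `X.Y.Z` string with no pre-release suffix.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Convert a plain 'X.Y.Z' string into a tuple of ints.

    Assumption: no pre-release or local suffix (e.g. 'rc1') is present.
    Surrounding whitespace is stripped before parsing.
    """
    return tuple(int(part) for part in version.strip().split("."))


def python_package_drops_flash_attention(version: str) -> bool:
    """Return True for 4.4.0 and later.

    Hypothetical helper: it only encodes the changelog statement that
    Flash Attention was removed from the Python package in v4.4.0.
    """
    return parse_version(version) >= (4, 4, 0)
```

Tuples of integers are compared element by element, so `4.10.0` sorts after `4.4.0`, which a plain string comparison would get wrong. In practice the input would be the version string defined in the file above, assuming the package exposes it at import time.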