Fast tokenizers for OpenAI token sets based on the [bpe](https://crates.io/crates/bpe) crate.
Serialized BPE instances are generated during build and lazily loaded at runtime as static values.
The overhead of loading the tokenizers is small because it happens only once per process and only requires deserialization (as opposed to actually building the internal data structures).
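The once-per-process lazy initialization can be sketched with `std::sync::OnceLock` from the standard library. This is a generic stand-in to illustrate the pattern, not the crate's actual code; the `Tokenizer` struct and `cl100k_demo` function below are hypothetical:

```rust
use std::sync::OnceLock;

// Hypothetical stand-in for the deserialized tokenizer data.
struct Tokenizer {
    name: &'static str,
}

fn cl100k_demo() -> &'static Tokenizer {
    // OnceLock runs the initializer at most once per process;
    // subsequent calls return the already-initialized value.
    static INSTANCE: OnceLock<Tokenizer> = OnceLock::new();
    INSTANCE.get_or_init(|| {
        // In the real crate this step would deserialize the BPE
        // data that was serialized at build time.
        Tokenizer { name: "cl100k" }
    })
}

fn main() {
    let tok = cl100k_demo();
    println!("{}", tok.name);
}
```

Because the static is initialized on first access, programs that never call a given tokenizer pay no loading cost for it.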
For convenience it re-exports the `bpe` crate, so depending on this crate is enough to use these tokenizers.
Supported token sets:
- r50k
- p50k
- cl100k
- o200k
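As a rough orientation, a helper could pick a token set from a model name. The `token_set_for` function and the model-to-token-set pairing below are assumptions based on common usage of these token sets, not taken from this crate's documentation:

```rust
// Hypothetical helper mapping a model name prefix to the name of
// its token set. The pairing is an assumption, not from the crate.
fn token_set_for(model: &str) -> Option<&'static str> {
    // Order matters: "gpt-4o" must be checked before "gpt-4".
    if model.starts_with("gpt-4o") {
        Some("o200k")
    } else if model.starts_with("gpt-4") || model.starts_with("gpt-3.5") {
        Some("cl100k")
    } else if model.starts_with("text-davinci") {
        Some("p50k")
    } else if model.starts_with("davinci") || model.starts_with("gpt-3") {
        Some("r50k")
    } else {
        None
    }
}

fn main() {
    println!("{:?}", token_set_for("gpt-4o-mini"));
}
```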
## Usage
Add a dependency by running
```sh
cargo add bpe-openai
```
or by adding the following to `Cargo.toml`:
```toml
[dependencies]
bpe-openai = "0.1"
```
Counting tokens is as simple as:
```rust
use bpe_openai::cl100k;

fn main() {
    let bpe = cl100k();
    let count = bpe.count("Hello, world!");
    println!("{count}");
}
```
For more detailed documentation, see the [bpe](https://crates.io/crates/bpe) crate.