demos/embeddings/README.md (10 changes: 5 additions & 5 deletions)
@@ -510,7 +510,7 @@ mteb run -m thenlper/gte-small -t Banking77Classification --output_folder results

# Usage of tokenize endpoint (release 2025.4 or weekly)

-The `tokenize` endpoint provides a simple API for tokenizing input text using the same tokenizer as the deployed embeddings model. This allows you to see how your text will be split into tokens before feature extraction or inference. The endpoint accepts a string or list of strings and returns the corresponding token IDs and tokenized text.
+The `tokenize` endpoint provides a simple API for tokenizing input text using the same tokenizer as the deployed embeddings model. This allows you to see how your text will be split into tokens before feature extraction or inference. The endpoint accepts a string or list of strings and returns the corresponding token IDs.

Example usage:
```console
@@ -524,10 +524,10 @@ Response:
```

It's possible to use additional parameters:
-- pad_to_max_length - whether to pad the sequence to the maximum length. Default is False.
-- max_length - maximum length of the sequence. If None (default), the value will be taken from the IR (where default value from original HF/GGUF model is stored).
-- padding_side - side to pad the sequence, can be ‘left’ or ‘right’. Default is None.
-- add_special_tokens - whether to add special tokens like BOS, EOS, PAD. Default is True.
+- `pad_to_max_length` - whether to pad the sequence to the maximum length. Default is False.
+- `max_length` - maximum length of the sequence. If None (default), unlimited.
+- `padding_side` - side to pad the sequence, can be ‘left’ or ‘right’. Default is None.
+- `add_special_tokens` - whether to add special tokens like BOS, EOS, PAD. Default is True.

Example usage:
```console
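Both `Example usage` blocks are collapsed in this diff view, so for orientation here is a hedged sketch of a request that exercises the parameters above. The route and payload field names (`/v3/tokenize`, `model`, `input`) are assumptions modeled on the OpenAI-style endpoints used elsewhere in this demo, not something this diff confirms; see the full README for the canonical request.

```console
curl http://localhost:8000/v3/tokenize \
  -H "Content-Type: application/json" \
  -d '{
        "model": "thenlper/gte-small",
        "input": ["OpenVINO Model Server tokenizes this sentence"],
        "max_length": 16,
        "pad_to_max_length": true,
        "padding_side": "right",
        "add_special_tokens": true
      }'
```

With `pad_to_max_length` enabled, the returned token IDs would be padded out to `max_length` on the side chosen by `padding_side`; leaving `add_special_tokens` at its default of True keeps markers such as BOS/EOS in the output.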
src/embeddings/embeddings_calculator_ov.cc (6 changes: 3 additions & 3 deletions)
@@ -63,7 +63,7 @@ class EmbeddingsCalculatorOV : public CalculatorBase {

mediapipe::Timestamp timestamp{0};

-absl::Status tokenizeStrings(ov::genai::Tokenizer& tokenizer, const std::vector<std::string>& inputStrings, const ov::AnyMap& parameters, ov::genai::TokenizedInputs& tokens, const size_t& max_context_length) {
+absl::Status tokenizeStrings(ov::genai::Tokenizer& tokenizer, const std::vector<std::string>& inputStrings, const ov::AnyMap& parameters, ov::genai::TokenizedInputs& tokens) {
tokens = tokenizer.encode(inputStrings, parameters);
RET_CHECK(tokens.input_ids.get_shape().size() == 2);

@@ -134,7 +134,7 @@ class EmbeddingsCalculatorOV : public CalculatorBase {
}
auto input = tokenizeRequest.input;
if (auto strings = std::get_if<std::vector<std::string>>(&input)) {
-auto tokenizationStatus = this->tokenizeStrings(embeddings_session->getTokenizer(), *strings, tokenizeRequest.parameters, tokens, max_context_length);
+auto tokenizationStatus = this->tokenizeStrings(embeddings_session->getTokenizer(), *strings, tokenizeRequest.parameters, tokens);
if (!tokenizationStatus.ok()) {
return tokenizationStatus;
}
@@ -172,7 +172,7 @@ class EmbeddingsCalculatorOV : public CalculatorBase {
params["max_length"] = max_context_length;
}

-absl::Status tokenizationStatus = this->tokenizeStrings(embeddings_session->getTokenizer(), *strings, params, tokens, max_context_length);
+absl::Status tokenizationStatus = this->tokenizeStrings(embeddings_session->getTokenizer(), *strings, params, tokens);
if (!tokenizationStatus.ok()) {
return tokenizationStatus;
}
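Taken together, the C++ hunks move the context limit out of `tokenizeStrings()`'s signature: the embeddings path now folds it into the `ov::AnyMap` as `max_length` before the call, and `ov::genai::Tokenizer::encode()` reads it from there. A minimal sketch of the resulting pattern, assuming a hypothetical free-standing wrapper (`tokenizeWithContextLimit`) and a guard condition of which this diff shows only the closing brace:

```cpp
// Sketch only: mirrors the post-refactor flow where the context limit
// rides inside the parameter map instead of a dedicated argument.
absl::Status tokenizeWithContextLimit(ov::genai::Tokenizer& tokenizer,
                                      const std::vector<std::string>& inputs,
                                      ov::AnyMap params,  // copied so it can be amended
                                      size_t max_context_length,
                                      ov::genai::TokenizedInputs& tokens) {
    if (params.count("max_length") == 0) {
        // Guard condition assumed; the diff shows only the assignment.
        params["max_length"] = max_context_length;
    }
    tokens = tokenizer.encode(inputs, params);  // encode() applies max_length itself
    RET_CHECK(tokens.input_ids.get_shape().size() == 2);
    return absl::OkStatus();
}
```

The payoff is that both call sites shown above shrink to the same four-argument call, and the tokenize endpoint path can pass user-supplied parameters through untouched.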