@@ -53,23 +53,26 @@ such as:

### Supported Models

- You can use any BERT, CamemBERT or XLM-RoBERTa model with absolute positions in `text-embeddings-inference`.
+ You can use any JinaBERT model with Alibi or absolute positions or any BERT, CamemBERT or XLM-RoBERTa model with
+ absolute positions in `text-embeddings-inference`.

**Support for other model types will be added in the future.**

Examples of supported models:

- | MTEB Rank | Model Type  | Model ID                                                                        |
- |-----------|-------------|---------------------------------------------------------------------------------|
- | 1         | Bert        | [BAAI/bge-large-en-v1.5](https://hf.co/BAAI/bge-large-en-v1.5)                  |
- | 2         |             | [BAAI/bge-base-en-v1.5](https://hf.co/BAAI/bge-base-en-v1.5)                    |
- | 3         |             | [llmrails/ember-v1](https://hf.co/llmrails/ember-v1)                            |
- | 4         |             | [thenlper/gte-large](https://hf.co/thenlper/gte-large)                          |
- | 5         |             | [thenlper/gte-base](https://hf.co/thenlper/gte-base)                            |
- | 6         |             | [intfloat/e5-large-v2](https://hf.co/intfloat/e5-large-v2)                      |
- | 7         |             | [BAAI/bge-small-en-v1.5](https://hf.co/BAAI/bge-small-en-v1.5)                  |
- | 10        |             | [intfloat/e5-base-v2](https://hf.co/intfloat/e5-base-v2)                        |
- | 11        | XLM-RoBERTa | [intfloat/multilingual-e5-large](https://hf.co/intfloat/multilingual-e5-large)  |
+ | MTEB Rank | Model Type  | Model ID                                                                                |
+ |-----------|-------------|-----------------------------------------------------------------------------------------|
+ | 1         | Bert        | [BAAI/bge-large-en-v1.5](https://hf.co/BAAI/bge-large-en-v1.5)                          |
+ | 2         |             | [BAAI/bge-base-en-v1.5](https://hf.co/BAAI/bge-base-en-v1.5)                            |
+ | 3         |             | [llmrails/ember-v1](https://hf.co/llmrails/ember-v1)                                    |
+ | 4         |             | [thenlper/gte-large](https://hf.co/thenlper/gte-large)                                  |
+ | 5         |             | [thenlper/gte-base](https://hf.co/thenlper/gte-base)                                    |
+ | 6         |             | [intfloat/e5-large-v2](https://hf.co/intfloat/e5-large-v2)                              |
+ | 7         |             | [BAAI/bge-small-en-v1.5](https://hf.co/BAAI/bge-small-en-v1.5)                          |
+ | 10        |             | [intfloat/e5-base-v2](https://hf.co/intfloat/e5-base-v2)                                |
+ | 11        | XLM-RoBERTa | [intfloat/multilingual-e5-large](https://hf.co/intfloat/multilingual-e5-large)          |
+ | N/A       | JinaBERT    | [jinaai/jina-embeddings-v2-base-en](https://hf.co/jinaai/jina-embeddings-v2-base-en)    |
+ | N/A       | JinaBERT    | [jinaai/jina-embeddings-v2-small-en](https://hf.co/jinaai/jina-embeddings-v2-small-en)  |


You can explore the list of best performing text embeddings models [here](https://huggingface.co/spaces/mteb/leaderboard).
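
For example, one of the newly listed JinaBERT models can be served by reusing the Docker invocation shown in the Deploy snippet below, only swapping in the model id from the table above (a sketch; the image tag, port and flags are the ones already used in this README):

```shell
model=jinaai/jina-embeddings-v2-base-en  # model id taken from the supported models table above
volume=$PWD/data  # share a volume with the Docker container to avoid downloading weights every run

docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:0.3.0 --model-id $model
```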
@@ -81,7 +84,7 @@ model=BAAI/bge-large-en-v1.5
revision=refs/pr/5
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run

- docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:0.2.2 --model-id $model --revision $revision
+ docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:0.3.0 --model-id $model --revision $revision
```

And then you can make requests like
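
A minimal sketch of such a request, assuming the container is listening on port 8080 as started above (the `/embed` route and the `inputs` payload field should be checked against the API documentation referenced later in this README):

```shell
# POST a single text to the embedding route and get the embedding vector back as JSON
curl 127.0.0.1:8080/embed \
    -X POST \
    -d '{"inputs":"What is Deep Learning?"}' \
    -H 'Content-Type: application/json'
```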
@@ -223,15 +226,15 @@ Options:

Text Embeddings Inference ships with multiple Docker images that you can use to target a specific backend:

- | Architecture                        | Image                                                       |
- |-------------------------------------|-------------------------------------------------------------|
- | CPU                                 | ghcr.io/huggingface/text-embeddings-inference:cpu-0.2.2     |
- | Volta                               | NOT SUPPORTED                                               |
- | Turing (T4, RTX 2000 series, ...)   | ghcr.io/huggingface/text-embeddings-inference:turing-0.2.2  |
- | Ampere 80 (A100, A30)               | ghcr.io/huggingface/text-embeddings-inference:0.2.2         |
- | Ampere 86 (A10, A40, ...)           | ghcr.io/huggingface/text-embeddings-inference:86-0.2.2      |
- | Ada Lovelace (RTX 4000 series, ...) | ghcr.io/huggingface/text-embeddings-inference:89-0.2.2      |
- | Hopper (H100)                       | ghcr.io/huggingface/text-embeddings-inference:hopper-0.2.2  |
+ | Architecture                        | Image                                                                      |
+ |-------------------------------------|----------------------------------------------------------------------------|
+ | CPU                                 | ghcr.io/huggingface/text-embeddings-inference:cpu-0.3.0                    |
+ | Volta                               | NOT SUPPORTED                                                              |
+ | Turing (T4, RTX 2000 series, ...)   | ghcr.io/huggingface/text-embeddings-inference:turing-0.3.0 (experimental)  |
+ | Ampere 80 (A100, A30)               | ghcr.io/huggingface/text-embeddings-inference:0.3.0                        |
+ | Ampere 86 (A10, A40, ...)           | ghcr.io/huggingface/text-embeddings-inference:86-0.3.0                     |
+ | Ada Lovelace (RTX 4000 series, ...) | ghcr.io/huggingface/text-embeddings-inference:89-0.3.0                     |
+ | Hopper (H100)                       | ghcr.io/huggingface/text-embeddings-inference:hopper-0.3.0 (experimental)  |
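
For example, the CPU-only image from the table above can be run with the same flags as the Deploy example, minus the GPU request (a sketch; dropping `--gpus` for the CPU image is an assumption):

```shell
model=BAAI/bge-large-en-v1.5
volume=$PWD/data  # share a volume with the Docker container to avoid downloading weights every run

# CPU image tag from the table above; no --gpus flag since no GPU is used
docker run -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-0.3.0 --model-id $model
```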

### API documentation

@@ -256,7 +259,7 @@ model=<your private model>
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run
token=<your cli READ token>

- docker run --gpus all -e HUGGING_FACE_HUB_TOKEN=$token -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:0.2.2 --model-id $model
+ docker run --gpus all -e HUGGING_FACE_HUB_TOKEN=$token -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:0.3.0 --model-id $model
```

### Distributed Tracing
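
As a rough sketch of what the distributed tracing setup looks like, the server can be pointed at an OTLP collector when it is started; the `--otlp-endpoint` flag and the collector address below are assumptions to verify against the CLI options listed earlier in this README:

```shell
model=BAAI/bge-large-en-v1.5
volume=$PWD/data  # share a volume with the Docker container to avoid downloading weights every run

# assumption: an OTLP-compatible collector (e.g. the OpenTelemetry Collector or Jaeger) is reachable at this address
docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:0.3.0 --model-id $model --otlp-endpoint 0.0.0.0:4317
```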