From 9fb10ad617e0a5b9661182bdf25a239748462049 Mon Sep 17 00:00:00 2001
From: Mohamed Attia
Date: Fri, 14 Feb 2025 00:22:05 +0100
Subject: [PATCH] Fix reference to `Nvidia MPS` in the README.

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 200dcc5269..bab1c9c06f 100644
--- a/README.md
+++ b/README.md
@@ -102,7 +102,7 @@ curl -X POST -d '{"model":"meta-llama/Meta-Llama-3-8B-Instruct", "prompt":"Hello
 Refer to [LLM deployment](docs/llm_deployment.md) for details and other methods.
 
 ## ⚡ Why TorchServe
-* Write once, run anywhere, on-prem, on-cloud, supports inference on CPUs, GPUs, AWS Inf1/Inf2/Trn1, Google Cloud TPUs, [Nvidia MPS](docs/nvidia_mps.md)
+* Write once, run anywhere, on-prem, on-cloud, supports inference on CPUs, GPUs, AWS Inf1/Inf2/Trn1, Google Cloud TPUs, [Nvidia MPS](docs/hardware_support/nvidia_mps.md)
 * [Model Management API](docs/management_api.md): multi model management with optimized worker to model allocation
 * [Inference API](docs/inference_api.md): REST and gRPC support for batched inference
 * [TorchServe Workflows](examples/Workflows/README.md): deploy complex DAGs with multiple interdependent models