**Describe the bug**
When the `-nvidia "all"` flag is used for AI inference, the number of entries in `aiModels.json` must match the number of installed GPUs; otherwise, every GPU is started with the first item in the list.
I'm not sure whether this is the expected behavior.
**To Reproduce**
- Install 2 GPUs in a machine
- Set `-nvidia "all"`
- Set your `aiModels.json` to include only the LLM model:
```json
[
  {
    "pipeline": "llm",
    "model_id": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "price_per_unit": 80000000,
    "pixels_per_unit": 1000000,
    "warm": true
  }
]
```
- If the second GPU has less than 24 GB of VRAM, it fails to launch the container and times out.
**Expected behavior**
If we specify the GPUs explicitly, e.g. `-nvidia 0,1`, and `aiModels.json` contains two models (one per GPU), it works as expected, i.e.:
```json
[
  {
    "pipeline": "llm",
    "model_id": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "price_per_unit": 80000000,
    "pixels_per_unit": 1000000,
    "warm": true
  },
  {
    "pipeline": "text-to-image",
    "model_id": "ByteDance/SDXL-Lightning",
    "price_per_unit": 4768371,
    "warm": true
  }
]
```
I think the documentation needs to be updated, but I am not sure whether this is actually the intended logic for the "all" value.
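For illustration, here is a minimal, purely hypothetical Go sketch (not the actual go-livepeer code) of an index-based GPU-to-model assignment that would explain the observed behavior: any GPU without a matching entry in `aiModels.json` falls back to the first entry in the list.

```go
package main

import "fmt"

// AIModel mirrors the fields from aiModels.json that matter here (hypothetical type).
type AIModel struct {
	Pipeline string
	ModelID  string
}

// assignModels pairs each GPU index with the aiModels.json entry at the same index.
// Any GPU beyond the end of the list falls back to models[0], which matches what
// was observed with -nvidia "all" and a single-entry aiModels.json.
func assignModels(gpus []int, models []AIModel) map[int]AIModel {
	assignment := make(map[int]AIModel)
	for _, gpu := range gpus {
		if gpu < len(models) {
			assignment[gpu] = models[gpu]
		} else {
			assignment[gpu] = models[0] // fallback described in this report
		}
	}
	return assignment
}

func main() {
	gpus := []int{0, 1} // what -nvidia "all" expands to on this 2-GPU host
	models := []AIModel{
		{Pipeline: "llm", ModelID: "meta-llama/Meta-Llama-3.1-8B-Instruct"},
	}
	for gpu, m := range assignModels(gpus, models) {
		fmt.Printf("GPU %d -> %s (%s)\n", gpu, m.ModelID, m.Pipeline)
	}
}
```

If this assumption is right, both GPUs try to load the LLM, and the 11 GB card cannot fit it, hence the timeout.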
**Set Up**
- Slot 0: 3090 (24 GB RAM)
- Slot 1: 2080 Ti (11 GB RAM)