QUESTION: Hello, your code is well-written. Can I leverage the TensorRT Actor in a complementary manner? #670

@glide-the

Hello, your code is well-crafted. It uses xoscar to achieve distributed scheduling. Architecturally, it communicates only through the actor_ref, which is more elegant than HTTP interaction. See the specific code here: https://github.com/xorbitsai/inference/blob/6135eb66f1595d41a7210f9f64c3db97adf0364b/xinference/client/oscar/actor_client.py#L432C14-L432C14
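To make the actor_ref point concrete, here is a minimal sketch of the actor-reference pattern in plain asyncio. It does not use xoscar or xinference, and it is not how either of them is implemented. `EchoModelActor`, `ActorRef`, `serve`, and `demo` are hypothetical names made up for illustration. The idea is that the caller holds only a reference and invokes methods on it, while the reference turns each call into a message for the actor's mailbox.

```python
import asyncio


class EchoModelActor:
    """Hypothetical actor standing in for a model worker."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.calls = 0

    async def generate(self, prompt: str) -> str:
        self.calls += 1
        return f"{self.name}: {prompt}"


class ActorRef:
    """Hypothetical reference: forwards method calls to an actor's mailbox."""

    def __init__(self, mailbox: asyncio.Queue) -> None:
        self._mailbox = mailbox

    def __getattr__(self, method: str):
        # Only reached for names not found on the ref itself,
        # so every unknown attribute becomes a remote-style call.
        async def call(*args, **kwargs):
            reply = asyncio.get_running_loop().create_future()
            await self._mailbox.put((method, args, kwargs, reply))
            return await reply

        return call


async def serve(actor, mailbox: asyncio.Queue) -> None:
    """Handle one message at a time, so actor state needs no locks."""
    while True:
        method, args, kwargs, reply = await mailbox.get()
        try:
            reply.set_result(await getattr(actor, method)(*args, **kwargs))
        except Exception as exc:  # report failures back to the caller
            reply.set_exception(exc)


async def demo() -> list:
    mailbox = asyncio.Queue()
    server = asyncio.create_task(serve(EchoModelActor("model-a"), mailbox))
    ref = ActorRef(mailbox)
    # The caller never touches the actor object, only the reference.
    results = await asyncio.gather(ref.generate("hello"), ref.generate("world"))
    server.cancel()
    return list(results)
```

Running `asyncio.run(demo())` drives both calls through the mailbox. In a real actor framework the mailbox would sit behind a network transport, but the calling code would look much the same, which is what makes this style feel lighter than hand-written HTTP endpoints.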

As for the RPC layer, roughly 40% of the code in xinference is dedicated to handling basic interactions; see https://github.com/xorbitsai/inference/blob/main/xinference/core/supervisor.py and https://github.com/xorbitsai/inference/blob/main/xinference/core/worker.py

As a result, this framework seems unlikely to be able to resolve the conflicts that may arise when multiple LLMs collaborate: https://github.com/xorbitsai/inference/blob/main/xinference/model/core.py#L32

From the code, it appears that they plan to place the TensorRT LLM in a separate project: https://github.com/xorbitsai/inference/blob/main/xinference/model/llm/core.py#L31

Labels: question