Background and Issue
Although you mentioned in your blog post that GPU usage reduces latency, I did not observe any GPU utilization when using this library out of the box. After examining the code, I couldn't find any logic that moves the model to the GPU.
I followed the example provided in the README section of the repository. Could you please confirm if there are additional steps required to enable GPU usage and achieve lower latency?
That said, after making the following changes to your code, I am now able to see GPU utilization.
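For reference, a minimal sketch of that kind of change, assuming the library loads its GLiNER model as a standard PyTorch module that supports .to(device); the checkpoint name below is only a placeholder, not necessarily the one this validator uses:

```python
import torch
from gliner import GLiNER

# Placeholder checkpoint; substitute whichever GLiNER model the validator loads.
model = GLiNER.from_pretrained("urchade/gliner_multi_pii-v1")

if torch.cuda.is_available():
    # Moving the model to the GPU is enough for inference to show GPU utilization.
    model = model.to("cuda")
```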
Proposed Solution
Add an optional use_gpu parameter to the GuardrailsPII validator that enables GPU acceleration when available. The implementation should (see the sketch after this list):
- Add GPU device management: Automatically detect CUDA availability and move the GLiNER model to GPU when requested
- Maintain backward compatibility: Default to CPU inference to preserve existing behavior
- Graceful fallback: Fall back to CPU if GPU is requested but not available
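A minimal sketch of how this could look, assuming the validator obtains its GLiNER model via GLiNER.from_pretrained and that the model supports PyTorch's .to(device); the function and parameter names are illustrative, not the validator's actual API:

```python
import torch
from gliner import GLiNER


def load_pii_model(model_name: str, use_gpu: bool = False) -> GLiNER:
    """Load the GLiNER model, optionally moving it to the GPU."""
    model = GLiNER.from_pretrained(model_name)

    if use_gpu:
        if torch.cuda.is_available():
            # GPU requested and available: move the model for faster inference.
            model = model.to("cuda")
        else:
            # Graceful fallback: keep CPU inference if CUDA is not available.
            print("use_gpu=True but CUDA is unavailable; falling back to CPU.")
    # Default (use_gpu=False) preserves the existing CPU-only behavior.
    return model
```

Because the parameter defaults to False, existing callers of GuardrailsPII would see no change in behavior.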