
Conversation

@amirkl94 (Contributor) commented on Jul 23, 2025

Purpose

This PR introduces a new FlashInfer backend for per-tensor scaled FP8 MoE. The new backend gives a performance improvement, as described below.

Accuracy tests

Ran lm_eval on GSM8K manually, using the following command:

VLLM_USE_FLASHINFER_MOE_FP8=1 CUDA_VISIBLE_DEVICES=0 VLLM_USE_V1=1 VLLM_ATTENTION_BACKEND=FLASHINFER \
lm_eval --model vllm --model_args pretrained=<Llama4 Scout ckpts path>,\
tensor_parallel_size=1,max_model_len=2048,kv_cache_dtype=auto \
--gen_kwargs temperature=0.0 --limit 500 --trust_remote_code \
--tasks gsm8k --num_fewshot 5 --batch_size 200

Results:

flashinfer backend:
|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.932|±  |0.0113|
|     |       |strict-match    |     5|exact_match|↑  |0.910|±  |0.0128|

default:
|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.932|±  |0.0113|
|     |       |strict-match    |     5|exact_match|↑  |0.916|±  |0.0124|

Perf tests

Tested on a single B200 GPU, using the latency benchmark:

VLLM_USE_V1=1 VLLM_USE_FLASHINFER_MOE_FP8=1 VLLM_USE_STANDALONE_COMPILE=0 python benchmarks/benchmark_latency.py --model=$model_dir --output-len=1024 --tensor-parallel-size=1 --input-len=128 --trust_remote_code --max-model-len=2048 --batch-size=1

Results:

flashinfer backend: Avg latency: 8.699801150080749 seconds
default backend: Avg latency: 12.011667945154477 seconds


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a new FlashInfer backend for per-tensor scaled FP8 Mixture of Experts (MoE), which shows promising performance improvements on SM100 architectures. The changes include adding a new custom operator, refactoring some utility functions into a shared module, and updating the quantization layers to use this new backend.

The code is generally well-structured, and the refactoring of utility functions into flashinfer_utils.py is a good step towards better code organization.

However, there are a couple of areas that could be improved for better maintainability and potentially better performance:

  1. There is significant code duplication in the logic that invokes the new MoE kernel from both the Fp8MoEMethod and ModelOptFp8MoEMethod. This should be refactored into a shared helper function.
  2. The tile_tokens_dim parameter for the new kernel is hardcoded, which might not be optimal for all workloads and differs from the dynamic approach used in the existing block-scale kernel (a rough sketch of a dynamic alternative is shown below).

Addressing these points will enhance the quality and robustness of the new backend.
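
As a rough illustration of the second point, a dynamic tile size could be derived along the lines of the sketch below. The helper name, bounds, and formula are illustrative assumptions, not the block-scale kernel's actual code.

def calculate_tile_tokens_dim(num_tokens: int, top_k: int,
                              num_experts: int) -> int:
    # Estimate how many tokens each expert receives, round up to the next
    # power of two, and clamp to a kernel-friendly range.
    tokens_per_expert = max(1, (num_tokens * top_k) // num_experts)
    tile = 1 << (tokens_per_expert - 1).bit_length()
    return min(max(tile, 8), 64)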

Comment on lines 456 to 474
return torch.ops.vllm.flashinfer_fused_moe_per_tensor_scale_fp8(
    routing_logits=router_logits,
    routing_bias=e_score_correction_bias,
    hidden_states=x,
    input_scale=layer.w13_input_scale,
    gemm1_weights=layer.w13_weight,
    gemm1_weights_scale=layer.w13_weight_scale,
    gemm2_weights=layer.w2_weight,
    gemm2_weights_scale=layer.w2_weight_scale,
    activation_scale=layer.w2_input_scale,
    num_experts=global_num_experts,
    top_k=top_k,
    num_expert_group=num_expert_group,
    topk_group=topk_group,
    intermediate_size=layer.intermediate_size_per_partition,
    local_expert_offset=layer.ep_rank * layer.local_num_experts,
    local_num_experts=layer.local_num_experts,
    use_routing_scales_on_input=apply_router_weight_on_input,
)
Contributor (severity: high)

There appears to be significant code duplication here. The logic inside this if self.flashinfer_moe_enabled: block is nearly identical to the logic in vllm/model_executor/layers/quantization/fp8.py (lines 993-1016).

Duplicating this code block makes future maintenance harder, as changes would need to be applied in two places.

To improve maintainability, I suggest refactoring this shared logic into a common helper function. This function could be placed in a utility module, perhaps vllm/model_executor/layers/quantization/utils/flashinfer_utils.py, and called from both Fp8MoEMethod.apply and ModelOptFp8MoEMethod.apply.

For example, you could create a helper like this:

# In a shared utility file
from typing import Optional

import torch


def apply_flashinfer_per_tensor_scale_fp8(
    layer: torch.nn.Module,
    x: torch.Tensor,
    router_logits: torch.Tensor,
    e_score_correction_bias: Optional[torch.Tensor],
    top_k: int,
    num_expert_group: Optional[int],
    topk_group: Optional[int],
    global_num_experts: int,
    apply_router_weight_on_input: bool,
) -> torch.Tensor:
    return torch.ops.vllm.flashinfer_fused_moe_per_tensor_scale_fp8(
        routing_logits=router_logits,
        routing_bias=e_score_correction_bias,
        hidden_states=x,
        input_scale=layer.w13_input_scale,
        gemm1_weights=layer.w13_weight,
        gemm1_weights_scale=layer.w13_weight_scale,
        gemm2_weights=layer.w2_weight,
        gemm2_weights_scale=layer.w2_weight_scale,
        activation_scale=layer.w2_input_scale,
        num_experts=global_num_experts,
        top_k=top_k,
        num_expert_group=num_expert_group,
        topk_group=topk_group,
        intermediate_size=layer.intermediate_size_per_partition,
        local_expert_offset=layer.ep_rank * layer.local_num_experts,
        local_num_experts=layer.local_num_experts,
        use_routing_scales_on_input=apply_router_weight_on_input,
    )

This would centralize the logic and make the code cleaner and easier to maintain.

Member

I would like this utility to be implemented to help prevent drift between the two call sites.

    routing_method_type: int = 3  # Llama4-styled routing method
) -> torch.Tensor:
    if routing_bias is None:
        routing_bias = torch.zeros(num_experts,
Contributor

I am a little worried about this line breaking CUDA graph capture, because we are creating a new tensor on the fly. Should we create this zero bias in the caller instead? Or maybe ask FlashInfer to support routing_bias=None so that we don't need to pass in a fake bias.
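
A minimal sketch of the first option, assuming the caller can cache the placeholder; the class and method names here are hypothetical, not vLLM's actual code:

from typing import Optional

import torch


class ZeroRoutingBiasCache:
    # Allocates the all-zero routing bias once and reuses it, so no new
    # tensor is created while a CUDA graph is being captured.
    def __init__(self) -> None:
        self._bias: Optional[torch.Tensor] = None

    def get(self, num_experts: int, device: torch.device) -> torch.Tensor:
        if self._bias is None or self._bias.numel() != num_experts:
            self._bias = torch.zeros(num_experts,
                                     dtype=torch.bfloat16,
                                     device=device)
        return self._bias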

Contributor

Not a blocking issue for now; we will fix this later if it really becomes an issue.

Contributor Author

Fair point. I think asking FlashInfer to support routing_bias=None is probably the better option.

Contributor

FlashInfer has fixed this in 0.2.9rc2. Do you think this is a blocker? If not, I would prefer to merge this PR first and file a follow-up PR after we have upgraded to FlashInfer v0.2.9rc2.

However, if you think this is a blocker, we can wait for the FlashInfer v0.2.9rc2 upgrade, which should happen very soon.

    local_num_experts: int,
    use_routing_scales_on_input: bool,
    routed_scaling_factor: float = 1.0,
    routing_method_type: int = 3  # Llama4-styled routing method
Contributor

Should we use RoutingMethodType.Llama4 instead of a hard-coded "3"?

Contributor

Not a blocking issue, just code style.

Contributor Author

It is better, but the issue is that if a different version of flashinfer is installed (or flashinfer isn't installed at all), we'll get an import error. I thought about doing this conversion inside the function, after we know the correct version of flashinfer is installed. WDYT?

Contributor

Or we could define our own class to mimic FlashInfer's class?

from enum import IntEnum

has_flashinfer = False
try:
    from flashinfer.fused_moe import RoutingMethodType
    has_flashinfer = True
except ImportError:
    pass

class FlashInferRoutingMethodType(IntEnum):
    # Default: Softmax -> TopK
    Default = RoutingMethodType.Default if has_flashinfer else 0
    # Renormalize: TopK -> Softmax
    Renormalize = RoutingMethodType.Renormalize if has_flashinfer else 1
    # DeepSeekV3: Sigmoid -> RoutingBiasAdd -> Top2 in group -> Top4 groups
    # -> Top8 experts from the Top4 groups
    DeepSeekV3 = RoutingMethodType.DeepSeekV3 if has_flashinfer else 2
    # Llama4: Top1 -> Sigmoid
    Llama4 = RoutingMethodType.Llama4 if has_flashinfer else 3
    # Qwen3: Softmax -> TopK -> Renormalize
    RenormalizeNaive = RoutingMethodType.RenormalizeNaive if has_flashinfer else 4
    # Unspecified
    Unspecified = RoutingMethodType.Unspecified if has_flashinfer else 5

Contributor

This is not critical and can be handled in later PRs

Member

I think we can make this class lazily imported and remove the default arg, so we only need to import it once inside the function.
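
A minimal sketch of that lazy-import approach; the helper name is hypothetical, and the import path follows the snippet above and may differ between flashinfer versions:

from typing import Optional


def _resolve_routing_method_type(routing_method_type: Optional[int]) -> int:
    # Deferred import: flashinfer is only touched when this code path runs,
    # so the module still imports cleanly when flashinfer is not installed.
    from flashinfer.fused_moe import RoutingMethodType
    if routing_method_type is None:
        # Default to Llama4-style routing, matching the current behaviour.
        return int(RoutingMethodType.Llama4)
    return routing_method_type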

@nvpohanh (Contributor)

Depends on #21485

@amirkl94 force-pushed the feat/flashinfer-fp8-moe branch from c3e365c to 872160e on July 27, 2025 06:33
        routing_bias = torch.zeros(num_experts,
                                   dtype=torch.bfloat16,
                                   device=hidden_states.device)
    num_expert_group = num_expert_group if num_expert_group is not None else 1
Contributor

num_expert_group = num_expert_group if num_expert_group is not None else 1

This should be set to 0 when num_expert_group is None.

@amirkl94 force-pushed the feat/flashinfer-fp8-moe branch from 872160e to fdf635b on July 28, 2025 07:28
    local_num_experts: int,
    use_routing_scales_on_input: bool,
    routed_scaling_factor: float = 1.0,
    routing_method_type: int = 3  # Llama4-styled routing method
Member

I'm not a fan of defaulting this parameter if it is going to dictate model support. For instance, in the current usage of this function the parameter isn't set, but there is no check that the model actually needs Llama 4 routing, i.e. it would be silently incorrect for a Mixtral with the same quantization.

Contributor

@amirkl94 Maybe let's remove the default value for routing_method_type and make it a required argument?

And from llama4.py we should pass this into fused_moe.py?

@mgoin (Member) commented on Jul 29, 2025

llama4 already does this by defining its own custom routing function and passing that into FusedMoE

custom_routing_function=Llama4MoE.custom_routing_function,

I suppose you could just check if custom_routing_function == Llama4MoE.custom_routing_function

Contributor Author

I don't think I can check custom_routing_function == Llama4MoE.custom_routing_function here, unless you meant in llama4.py?

Should I just make this parameter optional, pass it only from llama4, and default to the non-flashinfer implementation when it isn't passed?

Contributor

I think what @mgoin meant is that in modelopt.py:

return apply_flashinfer_per_tensor_scale_fp8(

the layer object is just an instance of FusedMoE, so you can dispatch routing_method using:

if layer.routing_method == Llama4MoE.custom_routing_function:
    routing_method = 3

Contributor

@mgoin is this what you meant?

Member

Yes, this is what I meant. Obviously not optimal, but it should be okay.

Contributor

@mgoin Currently, FlashInfer's per-tensor FP8 MoE only supports the Llama4 routing mode, so I told @amirkl94 to assert that layer.routing_method == Llama4MoE.custom_routing_function; if it is not, an exception is raised.

This is done so that if anyone wants to use FlashInfer per-tensor FP8 MoE for another model in the future, it fails loudly and tells the user why that model is not supported. My philosophy: a loud failure is better than silent corruption.

Could you check if the current implementation is acceptable to you? Thanks!
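
A minimal sketch of that guard; the attribute name and import path are assumptions based on the discussion above, not the exact code in this PR:

def assert_llama4_routing(layer) -> None:
    # Fail loudly if the model does not use Llama4-style routing, since
    # FlashInfer's per-tensor FP8 MoE currently supports only that mode.
    from vllm.model_executor.models.llama4 import Llama4MoE  # assumed path
    routing_fn = getattr(layer, "custom_routing_function", None)
    if routing_fn is not Llama4MoE.custom_routing_function:
        raise NotImplementedError(
            "FlashInfer per-tensor FP8 MoE only supports Llama4-style "
            "routing; this model's routing function is not supported.")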


@amirkl94 force-pushed the feat/flashinfer-fp8-moe branch from 185bdd6 to 6582abc on July 29, 2025 15:02
@amirkl94 requested review from nvpohanh and mgoin on July 30, 2025 06:58
@nvpohanh (Contributor)

The pipeline failure doesn't seem to be caused by this PR:

[2025-07-29T16:52:22Z] torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.75 GiB. GPU 0 has a total capacity of 22.05 GiB of which 685.88 MiB is free. Process 6405 has 17.01 GiB memory in use. Process 41 has 250.00 MiB memory in use. Including non-PyTorch memory, this process has 4.09 GiB memory in use. Of the allocated memory 3.72 GiB is allocated by PyTorch, and 159.51 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

@mgoin (Member) commented on Jul 30, 2025

I fixed some issues with the PR and validated accuracy and performance. I see about a 10% throughput improvement on GSM8K on a single B200.

lm_eval --model vllm --model_args pretrained=nvidia/Llama-4-Scout-17B-16E-Instruct-FP8,max_model_len=10000,quantization=modelopt --trust_remote_code --tasks gsm8k --num_fewshot 5 --batch_size auto
Processed prompts: 100%|██████████| 1319/1319 [01:14<00:00, 17.78it/s, est. speed input: 15466.22 toks/s, output: 1821.80 toks/s]
vllm (pretrained=nvidia/Llama-4-Scout-17B-16E-Instruct-FP8,max_model_len=10000,quantization=modelopt,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: auto
|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.9227|±  |0.0074|
|     |       |strict-match    |     5|exact_match|↑  |0.8999|±  |0.0083|

VLLM_USE_FLASHINFER_MOE_FP8=1 lm_eval --model vllm --model_args pretrained=nvidia/Llama-4-Scout-17B-16E-Instruct-FP8,max_model_len=10000,quantization=modelopt --trust_remote_code --tasks gsm8k --num_fewshot 5 --batch_size auto
Processed prompts: 100%|██████████| 1319/1319 [01:06<00:00, 19.83it/s, est. speed input: 17251.62 toks/s, output: 2029.75 toks/s]
vllm (pretrained=nvidia/Llama-4-Scout-17B-16E-Instruct-FP8,max_model_len=10000,quantization=modelopt,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: auto
|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.9196|±  |0.0075|
|     |       |strict-match    |     5|exact_match|↑  |0.9007|±  |0.0082|

Will do a final review now.

@mgoin enabled auto-merge (squash) on July 31, 2025 00:38
@github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) on Jul 31, 2025
@nvpohanh (Contributor)

The failure is:

[2025-07-31T01:06:15Z] Fork a new process to run a test 0
[2025-07-31T01:06:15Z] DEBUG 07-30 18:06:15 [__init__.py:38] Available plugins for group vllm.general_plugins:
[2025-07-31T01:06:15Z] DEBUG 07-30 18:06:15 [__init__.py:40] - lora_filesystem_resolver -> vllm.plugins.lora_resolvers.filesystem_resolver:register_filesystem_resolver
[2025-07-31T01:06:15Z] DEBUG 07-30 18:06:15 [__init__.py:43] All plugins in this group will be loaded. Set `VLLM_PLUGINS` to control which plugins to load.
[2025-07-31T01:06:15Z] Traceback (most recent call last):
[2025-07-31T01:06:15Z]   File "/vllm-workspace/tests/utils.py", line 741, in wrapper
[2025-07-31T01:06:15Z]     f(*args, **kwargs)
[2025-07-31T01:06:15Z]   File "/vllm-workspace/tests/models/test_initialization.py", line 115, in can_initialize
[2025-07-31T01:06:15Z]     LLM(
[2025-07-31T01:06:15Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/llm.py", line 275, in __init__
[2025-07-31T01:06:15Z]     self.llm_engine = LLMEngine.from_engine_args(
[2025-07-31T01:06:15Z]                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-07-31T01:06:15Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/engine/llm_engine.py", line 490, in from_engine_args
[2025-07-31T01:06:15Z]     vllm_config = engine_args.create_engine_config(usage_context)
[2025-07-31T01:06:15Z]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-07-31T01:06:15Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1002, in create_engine_config
[2025-07-31T01:06:15Z]     model_config = self.create_model_config()
[2025-07-31T01:06:15Z]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-07-31T01:06:15Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 870, in create_model_config
[2025-07-31T01:06:15Z]     return ModelConfig(
[2025-07-31T01:06:15Z]            ^^^^^^^^^^^^
[2025-07-31T01:06:15Z]   File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 120, in __init__
[2025-07-31T01:06:15Z]     s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
[2025-07-31T01:06:15Z] pydantic_core._pydantic_core.ValidationError: 1 validation error for ModelConfig
[2025-07-31T01:06:15Z]   Value error, The checkpoint you are trying to load has model type `hunyuan_v1_dense` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
[2025-07-31T01:06:15Z] 
[2025-07-31T01:06:15Z] You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git` [type=value_error, input_value=ArgsKwargs((), {'model': ...attention_dtype': None}), input_type=ArgsKwargs]
[2025-07-31T01:06:15Z]     For further information visit https://errors.pydantic.dev/2.11/v/value_error
[2025-07-31T01:06:16Z] FAILED

Doesn't seem to be related to this PR

@mgoin disabled auto-merge on July 31, 2025 02:35
@mgoin enabled auto-merge (squash) on July 31, 2025 02:35
@amirkl94 (Contributor Author)

@mgoin The CI errors seem to be unrelated to my PR, as I saw them happening on other branches as well: https://github.com/vllm-project/vllm/pull/21747/commits

@mgoin (Member) commented on Jul 31, 2025

Yes, this is what I've found too. I've requested force merge, thank you.

@vllm-bot merged commit 207b750 into vllm-project:main on Jul 31, 2025 (72 of 74 checks passed)