[Bug] Shared Memory Overflow in Generated CUDA Kernel When Compiling ONNX Model via Relay Build #18094


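Compiling the attached ONNX model for the `cuda` target fails during the `relay.build()` phase: the generated CUDA kernel overflows shared memory. The full error output is in the attached `error.txt`.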

---

### Environment

- OS: Ubuntu 20.04
- Python: 3.8
- TVM Version: v0.9.0 (built from source)
- CUDA Version: 11.1
- GPU: NVIDIA RTX 2080Ti
- ONNX: 1.14.1
- ONNX model: See attached (simplified) model

---

### Steps to reproduce

```python
import onnx
import tvm
from tvm import relay

# Load the uploaded ONNX model
model = onnx.load("original.onnx")

# Provide input shape
shape_dict = {"input": (1, 32, 28, 28)}

# Convert ONNX to Relay
mod, params = relay.frontend.from_onnx(model, shape_dict)

# Try to compile for CUDA
target = "cuda"
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
```

[error.txt](https://github.com/user-attachments/files/20922208/error.txt)

[input_info.txt](https://github.com/user-attachments/files/20922217/input_info.txt)

[relay.txt](https://github.com/user-attachments/files/20922219/relay.txt)
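
To help triage, here is a rough sketch of how one could estimate static shared memory usage per kernel from a dump of CUDA source text. It is not part of TVM; `static_shared_bytes`, `DTYPE_BYTES`, and `ASSUMED_LIMIT_BYTES` are hypothetical names, the 48 KB per-block limit is an assumption that should be checked against the actual device, and the sample kernel is made up for illustration rather than taken from the attached logs.

```python
import re

# Sizes in bytes for a few CUDA element types (illustrative subset, extend as needed).
DTYPE_BYTES = {"char": 1, "half": 2, "int": 4, "float": 4, "double": 8}

# Assumed per-block static shared memory limit (48 KB); verify for your device.
ASSUMED_LIMIT_BYTES = 48 * 1024

# Kernel name, optionally preceded by a __launch_bounds__(N) qualifier.
KERNEL_HEAD = re.compile(r"void\s+(?:__launch_bounds__\(\d+\)\s+)?(\w+)\s*\(")
# Static declarations such as `__shared__ float buf[1024];`
SHARED_DECL = re.compile(r"__shared__\s+(\w+)\s+\w+\s*\[\s*(\d+)\s*\]")


def static_shared_bytes(cuda_source):
    """Return {kernel_name: total bytes of static __shared__ arrays}."""
    totals = {}
    # Everything after each `__global__` keyword is treated as one kernel.
    for chunk in cuda_source.split("__global__")[1:]:
        head = KERNEL_HEAD.search(chunk)
        name = head.group(1) if head else "<unknown>"
        total = 0
        for dtype, count in SHARED_DECL.findall(chunk):
            if dtype not in DTYPE_BYTES:
                # Fail loudly rather than silently undercount.
                raise ValueError(f"unknown element type: {dtype}")
            total += DTYPE_BYTES[dtype] * int(count)
        totals[name] = total
    return totals


# Hypothetical kernel text, not taken from the attached logs.
SAMPLE = """
extern "C" __global__ void __launch_bounds__(64) demo_kernel(float* a) {
  __shared__ float pad_temp_shared[8192];
  __shared__ float kernel_shared[4608];
}
"""

usage = static_shared_bytes(SAMPLE)
over_limit = {k: v for k, v in usage.items() if v > ASSUMED_LIMIT_BYTES}
print(usage, over_limit)
```

Only fixed-size array declarations with a literal element count are counted; dynamically sized `extern __shared__` buffers and vector element types are outside what this sketch handles.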


### Model Attachment

Because GitHub Issues does not accept `.onnx` uploads, the ONNX model is attached as base64 text in `original.onnx.txt`. You can decode it as follows:

```python
import base64

with open("original.onnx.txt", "r") as f:
    b64 = f.read()

with open("original.onnx", "wb") as f:
    f.write(base64.b64decode(b64))
```

[original.onnx.txt](https://github.com/user-attachments/files/20922261/original.onnx.txt)
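
For anyone who wants to confirm that the decoded file matches the original, a minimal sketch of the round trip with a checksum is below. The helper names `encode_model` and `decode_model` are hypothetical and only illustrate the idea; the encoding step is the presumed reverse of the decode snippet above.

```python
import base64
import hashlib
from pathlib import Path


def encode_model(src_path, txt_path):
    """Write the base64 text form of a binary file."""
    data = Path(src_path).read_bytes()
    Path(txt_path).write_text(base64.b64encode(data).decode("ascii"))


def decode_model(txt_path, dst_path):
    """Decode base64 text back to bytes and return the SHA-256 of the result."""
    data = base64.b64decode(Path(txt_path).read_text())
    Path(dst_path).write_bytes(data)
    return hashlib.sha256(data).hexdigest()
```

Calling `decode_model("original.onnx.txt", "original.onnx")` writes the model and returns a digest that can be compared against a SHA-256 of the source file to rule out a corrupted attachment.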
