@hero78119 hero78119 commented Sep 12, 2025

Switch to the new create_proof API in ceno GPU.

benchmark

Measured on the fibonacci e2e benchmark:
CPU: 5900XT 32 cores
GPU: RTX 5070 TI 16GB

| Benchmark | Old GPU Time (s) | New GPU Time (s) | GPU Change (%) | Speedup (×) |
|---|---|---|---|---|
| fibonacci_max_steps_1048576 | 1.162 | 0.897 | -22.77% | 1.29× |
| fibonacci_max_steps_2097152 | 1.627 | 1.250 | -23.19% | 1.30× |
| fibonacci_max_steps_4194304 | 2.441 | 1.919 | -21.40% | 1.27× |
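
For reference, the two derived columns can be recomputed from the timings. The sketch below assumes the usual definitions, change = (new - old) / old and speedup = old / new; the helper names `gpu_change_pct` and `speedup` are illustrative and not part of ceno. Since the times in the table are rounded to three decimals, recomputed values can differ from the reported ones in the last digit.

```python
def gpu_change_pct(old_s: float, new_s: float) -> float:
    """Relative change in GPU time in percent (negative means faster)."""
    return (new_s - old_s) / old_s * 100.0


def speedup(old_s: float, new_s: float) -> float:
    """How many times faster the new run is than the old one."""
    return old_s / new_s


# (old, new) GPU times in seconds, copied from the benchmark table.
results = {
    "fibonacci_max_steps_1048576": (1.162, 0.897),
    "fibonacci_max_steps_2097152": (1.627, 1.250),
    "fibonacci_max_steps_4194304": (2.441, 1.919),
}

for name, (old_s, new_s) in results.items():
    change = gpu_change_pct(old_s, new_s)
    ratio = speedup(old_s, new_s)
    print(f"{name}: {change:+.2f}% {ratio:.2f}x")
```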

latency breakdown

master branch

TRACE    ZKVM_create_proof [ 3.85s | 0.00% / 100.00% ]
TRACE    ┝━ commit_to_pi [ 31.5µs | 0.00% ] profiling_1: true
TRACE    ┝━ commit_to_fixed_commit [ 6.54µs | 0.00% ] profiling_1: true
TRACE    ┝━ batch commit to traces [ 2.31s | 0.50% / 60.00% ] profiling_1: true
TRACE    │  ┝━ [gpu] init pp [ 1.25µs | 0.00% ] profiling_2: true
TRACE    │  ┝━ [gpu] hal init [ 116ms | 3.01% ] profiling_2: true
TRACE    │  ┝━ [gpu] batch_commit [ 2.17s | 56.44% ] profiling_2: true
TRACE    │  ┝━ [gpu] get_pure_commitment [ 220ns | 0.00% ] profiling_2: true
TRACE    │  ┝━ [gpu] get_mle_witness_from_commitment [ 2.50ms | 0.06% ] profiling_2: true
TRACE    │  ┕━ [gpu] transmute back [ 4.01µs | 0.00% ] profiling_2: true
TRACE    ┝━ transfer pk to device [ 9.19ms | 0.24% ] profiling_1: true
TRACE    ┝━ main_proofs [ 1.46s | 0.93% / 38.02% ] profiling_1: true
TRACE    │  ┝━ create_chip_proof [ 197ms | 0.00% / 5.12% ] table_name: "ADD"
TRACE    │  │  ┝━ per_layer_gen_witness [ 3.18ms | 0.08% ] profiling_2: true
TRACE    │  │  ┝━ prove_tower_relation [ 153ms | 0.00% / 3.98% ] profiling_2: true
TRACE    │  │  │  ┝━ build_tower_witness [ 28.1ms | 0.73% ] profiling_2: true
TRACE    │  │  │  ┝━ extract_out_evals_from_gpu_towers [ 531µs | 0.01% ] profiling_2: true
TRACE    │  │  │  ┕━ prove_tower_relation [ 125ms | 3.24% ] profiling_2: true
TRACE    │  │  ┕━ layer_proof [ 40.4ms | 1.05% ] profiling_2: true

target branch

TRACE    ZKVM_create_proof [ 3.47s | 0.00% / 100.00% ]
TRACE    ┝━ commit_to_pi [ 36.0µs | 0.00% ] profiling_1: true
TRACE    ┝━ commit_to_fixed_commit [ 8.01µs | 0.00% ] profiling_1: true
TRACE    ┝━ batch commit to traces [ 2.28s | 0.52% / 65.85% ] profiling_1: true
TRACE    │  ┝━ [gpu] init pp [ 1.11µs | 0.00% ] profiling_2: true
TRACE    │  ┝━ [gpu] hal init [ 108ms | 3.11% ] profiling_2: true
TRACE    │  ┝━ [gpu] batch_commit [ 2.15s | 62.11% ] profiling_2: true
TRACE    │  ┝━ [gpu] get_pure_commitment [ 230ns | 0.00% ] profiling_2: true
TRACE    │  ┝━ [gpu] get_mle_witness_from_commitment [ 3.89ms | 0.11% ] profiling_2: true
TRACE    │  ┕━ [gpu] transmute back [ 4.06µs | 0.00% ] profiling_2: true
TRACE    ┝━ transfer pk to device [ 7.75ms | 0.22% ] profiling_1: true
TRACE    ┝━ main_proofs [ 1.11s | 1.02% / 32.02% ] profiling_1: true
TRACE    │  ┝━ create_chip_proof [ 157ms | 0.00% / 4.51% ] table_name: "ADD"
TRACE    │  │  ┝━ per_layer_gen_witness [ 2.93ms | 0.08% ] profiling_2: true
TRACE    │  │  ┝━ prove_tower_relation [ 112ms | 0.00% / 3.22% ] profiling_2: true
TRACE    │  │  │  ┝━ build_tower_witness [ 16.5ms | 0.48% ] profiling_2: true
TRACE    │  │  │  ┝━ extract_out_evals_from_gpu_towers [ 480µs | 0.01% ] profiling_2: true
TRACE    │  │  │  ┕━ prove_tower_relation [ 94.7ms | 2.73% ] profiling_2: true
TRACE    │  │  ┕━ layer_proof [ 41.8ms | 1.21% ] profiling_2: true
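
The two traces can also be compared span by span. The following is a minimal sketch, assuming the line layout shown in the traces (`TRACE`, tree-drawing characters, span name, then `[ duration | ...`); `parse_spans` and `compare` are hypothetical helpers, not ceno or tracing APIs, and only a subset of the lines is embedded.

```python
import re

# Matches one span line, e.g.
#   TRACE    │  ┝━ [gpu] batch_commit [ 2.17s | 56.44% ] profiling_2: true
SPAN_RE = re.compile(
    r"^TRACE\s+[│┝┕━\s]*(?P<name>.+?)\s+\[\s*"
    r"(?P<value>[\d.]+)(?P<unit>ns|[µμ]s|ms|s)\s*\|"
)
# Both the micro sign and the Greek letter mu are accepted for microseconds.
UNIT_TO_SECONDS = {"ns": 1e-9, "µs": 1e-6, "μs": 1e-6, "ms": 1e-3, "s": 1.0}


def parse_spans(trace: str) -> list[tuple[str, float]]:
    """Return (span name, duration in seconds) for each span line, in order."""
    spans = []
    for line in trace.splitlines():
        m = SPAN_RE.match(line.strip())
        if m:
            seconds = float(m["value"]) * UNIT_TO_SECONDS[m["unit"]]
            spans.append((m["name"], seconds))
    return spans


def compare(master: str, target: str) -> list[tuple[str, float, float, float]]:
    """Pair spans by position; return (name, old_s, new_s, change in %)."""
    rows = []
    for (name_a, old_s), (name_b, new_s) in zip(parse_spans(master), parse_spans(target)):
        if name_a != name_b:
            raise ValueError(f"span mismatch: {name_a!r} vs {name_b!r}")
        rows.append((name_a, old_s, new_s, (new_s - old_s) / old_s * 100.0))
    return rows


# Subset of the master and target traces above.
MASTER = """
TRACE    ZKVM_create_proof [ 3.85s | 0.00% / 100.00% ]
TRACE    ┝━ batch commit to traces [ 2.31s | 0.50% / 60.00% ] profiling_1: true
TRACE    │  ┝━ [gpu] batch_commit [ 2.17s | 56.44% ] profiling_2: true
TRACE    ┝━ main_proofs [ 1.46s | 0.93% / 38.02% ] profiling_1: true
TRACE    │  │  │  ┝━ build_tower_witness [ 28.1ms | 0.73% ] profiling_2: true
"""
TARGET = """
TRACE    ZKVM_create_proof [ 3.47s | 0.00% / 100.00% ]
TRACE    ┝━ batch commit to traces [ 2.28s | 0.52% / 65.85% ] profiling_1: true
TRACE    │  ┝━ [gpu] batch_commit [ 2.15s | 62.11% ] profiling_2: true
TRACE    ┝━ main_proofs [ 1.11s | 1.02% / 32.02% ] profiling_1: true
TRACE    │  │  │  ┝━ build_tower_witness [ 16.5ms | 0.48% ] profiling_2: true
"""

rows = compare(MASTER, TARGET)
for name, old_s, new_s, change in rows:
    print(f"{name}: {old_s:.4f}s -> {new_s:.4f}s ({change:+.1f}%)")
```

Read this way, the saving is concentrated in main_proofs (1.46s to 1.11s, roughly 24% less), while batch commit to traces is nearly unchanged (2.31s to 2.28s).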

@hero78119 hero78119 requested a review from Velaciela September 12, 2025 12:48
@hero78119 hero78119 force-pushed the feat/optimise_gpu branch 2 times, most recently from c032da7 to d33759b, September 12, 2025 13:27
@hero78119 hero78119 force-pushed the feat/optimise_gpu branch 2 times, most recently from 61da549 to c6dc897, September 25, 2025 14:44