|
| 1 | +# Critical Proof System Bug Investigation |
| 2 | + |
| 3 | +Created: July 7, 2025 12:00 PM UTC |
| 4 | +Last Modified: July 7, 2025 10:39 PM UTC |
| 5 | + |
| 6 | +## Executive Summary |
| 7 | + |
| 8 | +**UPDATE (July 7, 2025 10:39 PM UTC)**: The originally reported errors are NOT reproducing. Current status shows different issues: |
| 9 | +- **Snarky**: Working correctly ✅ |
| 10 | +- **Sparky**: Partial functionality with permutation construction errors |
| 11 | + |
| 12 | +**ORIGINAL REPORT**: The o1js proof system was reported as broken for BOTH Snarky and Sparky backends. |
| 13 | + |
| 14 | +## Symptoms |
| 15 | + |
| 16 | +1. **Zero-constraint programs**: `Cannot read properties of undefined (reading '2')` |
| 17 | + - Occurs during proof verification |
| 18 | + - Affects programs with Void inputs/outputs and no constraints |
| 19 | + |
| 20 | +2. **Programs with constraints**: `Cannot read properties of undefined (reading 'toConstant')` |
| 21 | + - Occurs in field conversion logic |
| 22 | + - Error at `fields.js:14` during `toFieldConsts` |
| 23 | + |
| 24 | +3. **Cross-backend verification**: Works correctly |
| 25 | + - Snarky proofs can be verified with Sparky backend |
| 26 | + - Suggests the issue is in proof generation/verification flow, not backend compatibility |
| 27 | + |
| 28 | +## Root Cause Analysis |
| 29 | + |
| 30 | +### Error Location |
| 31 | +The primary error occurs in `/src/bindings/crypto/bindings/conversion-proof.ts`: |
| 32 | +- Line 120: `let zComm = core.polyCommToRust(commitments[2]);` |
| 33 | +- Line 200: `let openingProof = openingProofToRust(proof[2]);` |
| 34 | + |
| 35 | +The code expects proof arrays with at least 3 elements, but receives undefined or incomplete structures. |
| 36 | + |
| 37 | +### Git History Investigation |
| 38 | + |
| 39 | +1. **Commit `31c3467c4` (July 6)**: "Fix Poseidon hash implementation" |
| 40 | + - Initially suspected as root cause |
| 41 | + - Deleted 85 test files (7,846 lines) |
| 42 | + - **Reverted changes**: Issue persists, NOT the root cause |
| 43 | + |
| 44 | +2. **Commit `5155ec003` (July 6)**: "Implement kimchi permutation construction" ⚠️ **HIGHLY SUSPICIOUS** |
| 45 | + - Massive commit: 400+ files added/modified |
| 46 | + - Added permutation infrastructure |
| 47 | + - Modified constraint system handling |
| 48 | + - Timing correlates with when issues started |
| 49 | + |
| 50 | +3. **Commit `b1e84e15e` (July 5)**: "Complete rangeCheck1 implementation" |
| 51 | + - Small, focused change |
| 52 | + - Unlikely to cause systemic issues |
| 53 | + |
| 54 | +## Test Results |
| 55 | + |
| 56 | +### Zero Constraint Test |
| 57 | +```javascript |
| 58 | +// Program with NO constraints |
| 59 | +const EmptyProgram = ZkProgram({ |
| 60 | + name: 'EmptyProgram', |
| 61 | + publicInput: Void, |
| 62 | + publicOutput: Void, |
| 63 | + methods: { |
| 64 | + empty: { |
| 65 | + privateInputs: [], |
| 66 | + async method() { |
| 67 | + // No constraints |
| 68 | + } |
| 69 | + } |
| 70 | + } |
| 71 | +}); |
| 72 | +``` |
| 73 | +- **Snarky**: ❌ `Cannot read properties of undefined (reading '2')` |
| 74 | +- **Sparky**: ❌ `Cannot read properties of undefined (reading '2')` |
| 75 | + |
| 76 | +### One Constraint Test |
| 77 | +```javascript |
| 78 | +// Program with one constraint |
| 79 | +const OneConstraintProgram = ZkProgram({ |
| 80 | + name: 'OneConstraintProgram', |
| 81 | + publicInput: Field, |
| 82 | + publicOutput: Void, |
| 83 | + methods: { |
| 84 | + test: { |
| 85 | + privateInputs: [], |
| 86 | + async method(x) { |
| 87 | + x.assertEquals(Field(0)); |
| 88 | + } |
| 89 | + } |
| 90 | + } |
| 91 | +}); |
| 92 | +``` |
| 93 | +- **Snarky**: ❌ `Cannot read properties of undefined (reading 'toConstant')` |
| 94 | +- **Sparky**: ❌ `Cannot read properties of undefined (reading 'toConstant')` |
| 95 | + |
| 96 | +## Impact Assessment |
| 97 | + |
| 98 | +- **Severity**: CRITICAL |
| 99 | +- **Scope**: Entire proof system (both backends) |
| 100 | +- **Functions affected**: |
| 101 | + - Proof creation |
| 102 | + - Proof verification |
| 103 | + - Field conversion |
| 104 | + - Constraint system handling |
| 105 | + |
| 106 | +## Recommended Actions |
| 107 | + |
| 108 | +### Immediate |
| 109 | +1. **Revert commit `5155ec003`** - The massive permutation construction changes are highly suspicious |
| 110 | +2. **Restore deleted test files** from commit `31c3467c4` - These could have caught the issues |
| 111 | +3. **Add validation** in `conversion-proof.ts` before array access |
| 112 | + |
| 113 | +### Investigation |
| 114 | +1. Trace proof structure through the entire pipeline |
| 115 | +2. Check if proof format changed in recent commits |
| 116 | +3. Validate array structures before accessing elements |
| 117 | +4. Add comprehensive logging to proof generation/verification |
| 118 | + |
| 119 | +### Long-term |
| 120 | +1. Implement integration tests that would catch such systemic failures |
| 121 | +2. Add CI checks that run basic proof creation/verification |
| 122 | +3. Establish code review process for large commits |
| 123 | +4. Document expected proof structures |
| 124 | + |
| 125 | +## Current Status (July 7, 2025 10:39 PM UTC) |
| 126 | + |
| 127 | +### Test Results |
| 128 | + |
| 129 | +Running the exact test cases from this report: |
| 130 | + |
| 131 | +#### Zero Constraint Test (EmptyProgram) |
| 132 | +- **Snarky**: ✅ Success (compiles and proves) |
| 133 | +- **Sparky**: ✅ Success (compiles and proves) |
| 134 | + |
| 135 | +#### One Constraint Test (OneConstraintProgram) |
| 136 | +- **Snarky**: ✅ Success (compiles and proves) |
| 137 | +- **Sparky**: ❌ Error: "the permutation was not constructed correctly: final value" |
| 138 | + |
| 139 | +### Analysis |
| 140 | + |
| 141 | +1. **Original Errors Not Reproducing**: The specific errors mentioned ("Cannot read properties of undefined") are not occurring in the current codebase. |
| 142 | + |
| 143 | +2. **Snarky Backend**: Fully functional for both test cases. |
| 144 | + |
| 145 | +3. **Sparky Backend Issues**: |
| 146 | + - Programs with public inputs and simple operations (like `x.add(Field(1))`) generate 0 constraints during analysis phase |
| 147 | + - This causes "FieldVector.get(): Index out of bounds, got 0/0" errors during proof generation |
| 148 | + - Programs with explicit constraints (like `assertEquals`) fail with permutation construction errors |
| 149 | + |
| 150 | +4. **Constraint Generation Behavior**: |
| 151 | + - During `analyzeMethods`, both backends show 0 constraints for simple arithmetic operations |
| 152 | + - This appears to be expected behavior - constraints are generated during compilation/proving, not analysis |
| 153 | + - The "Fresh Snapshot Fix" in Sparky correctly retrieves updated constraint counts |
| 154 | + |
| 155 | +### Investigation Findings |
| 156 | + |
| 157 | +1. **Debug Mode Testing**: Even with all optimizations disabled (Debug mode), the behavior remains the same, ruling out aggressive optimization as the cause. |
| 158 | + |
| 159 | +2. **Snapshot Timing**: The constraint system snapshot mechanism works correctly - the issue is not timing-related. |
| 160 | + |
| 161 | +3. **Root Cause**: The permutation construction system in Sparky has issues when dealing with certain constraint patterns, particularly with public inputs. |
| 162 | + |
| 163 | +## Conclusion |
| 164 | + |
| 165 | +The critical system-wide breakage reported in this document is not currently present. The issues have evolved to be Sparky-specific, primarily around permutation construction and constraint generation with public inputs. The Snarky backend is functioning correctly. |
| 166 | + |
| 167 | +The massive commit `5155ec003` may still be relevant to the current Sparky issues, as it introduced the permutation infrastructure that is now showing errors. |
0 commit comments