Version: 0.1.0
Status: ✅ Implemented
Feature: 029-canonical-json-pattern
The canonical JSON format provides a standardized, deterministic JSON representation of Pattern<Subject> structures. This format enables:
- Interoperability: Seamless data exchange between implementations
- Validation: Formal JSON Schema for verification
- Type Safety: Generated TypeScript and Rust definitions
- Testing: Byte-for-byte comparison for equivalence checking
- Determinism: Identical structures produce identical JSON (sorted keys)
- Completeness: All pattern features representable in JSON
- Simplicity: Direct mapping between Haskell and JSON types
- Standards Compliance: Valid JSON Schema Draft 2020-12
{
  "subject": <Subject>,
  "elements": [<Pattern>, ...]
}
Fields:
- subject: Subject (required)
- elements: Array of nested Pattern objects (required, may be empty)
{
  "identity": <string>,
  "labels": [<string>, ...],
  "properties": {<string>: <Value>, ...}
}
Fields:
- identity: Identity string (required, may be empty for anonymous subjects)
- labels: Array of label strings (required, sorted alphabetically in canonical form)
- properties: Map of property names to values (required, keys sorted alphabetically in canonical form)
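The required fields of Pattern and Subject can be checked structurally with a few lines of code. The following is a minimal sketch, not the project's validator: the function names `is_subject` and `is_pattern` are hypothetical, and property values are deliberately not inspected.

```python
def is_subject(obj):
    """Return True if obj has the three required Subject fields with the expected JSON types."""
    return (
        isinstance(obj, dict)
        and isinstance(obj.get("identity"), str)
        and isinstance(obj.get("labels"), list)
        and all(isinstance(label, str) for label in obj["labels"])
        and isinstance(obj.get("properties"), dict)
    )


def is_pattern(obj):
    """Return True if obj is a Pattern: a Subject plus a (possibly empty) list of Patterns."""
    return (
        isinstance(obj, dict)
        and is_subject(obj.get("subject"))
        and isinstance(obj.get("elements"), list)
        and all(is_pattern(element) for element in obj["elements"])
    )


# An anonymous subject with no labels, properties, or elements is still valid.
leaf = {"subject": {"identity": "", "labels": [], "properties": {}}, "elements": []}
print(is_pattern(leaf))
```

For full verification, the formal JSON Schema remains the authoritative check; this sketch only covers the shape of the two outer structures.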
Values use a discriminated union approach with 10 supported types:
Integer:
42
Decimal:
3.14
Boolean:
true
String:
"Hello, World!"
Symbol:
{
  "type": "symbol",
  "value": "identifier"
}
Tagged String:
{
  "type": "tagged",
  "tag": "json",
  "content": "{\"key\": \"value\"}"
}
Range:
{
  "type": "range",
  "lower": 1.0,
  "upper": 10.0
}
Note: lower and upper can be null for unbounded ranges.
Measurement:
{
  "type": "measurement",
  "unit": "kg",
  "value": 5.0
}
Array:
[1, 2, 3, "mixed", true]
Arrays can contain any Value type (including nested arrays).
Map:
{
  "key1": "value1",
  "key2": 42
}
Maps do NOT have a type field (this distinguishes them from complex types).
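The dispatch rule described above (plain JSON scalars, arrays, tagged objects, and untagged maps) can be sketched as a small classifier over a decoded JSON value. This is an illustration under stated assumptions: `classify_value` and `COMPLEX_TAGS` are hypothetical names, and an object counts as a complex value only when its `type` field is one of the four tags shown in this document.

```python
COMPLEX_TAGS = {"symbol", "tagged", "range", "measurement"}


def classify_value(value):
    """Map a decoded JSON value to the name of one of the ten value types."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "decimal"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        # Assumption: only a string "type" matching a known tag marks a complex
        # value; every other object is treated as a plain map.
        tag = value.get("type")
        if isinstance(tag, str) and tag in COMPLEX_TAGS:
            return tag
        return "map"
    raise ValueError(f"not a supported value: {value!r}")


print(classify_value({"type": "measurement", "unit": "kg", "value": 5.0}))
print(classify_value({"key1": "value1", "key2": 42}))
```

One consequence of this design is that a map whose own key happens to be `type` with a string value such as `"symbol"` would be read as a complex value, so the sketch assumes maps avoid that combination.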
Gram:
(person {name: "Alice", age: 30})
JSON:
{
  "subject": {
    "identity": "person",
    "labels": [],
    "properties": {
      "age": 30,
      "name": "Alice"
    }
  },
  "elements": []
}
Gram:
(alice:Person:User {active: true})
JSON:
{
  "subject": {
    "identity": "alice",
    "labels": ["Person", "User"],
    "properties": {
      "active": true
    }
  },
  "elements": []
}
Gram:
[parent | child | grandchild]
JSON:
{
  "subject": {
    "identity": "parent",
    "labels": [],
    "properties": {}
  },
  "elements": [
    {
      "subject": {
        "identity": "child",
        "labels": [],
        "properties": {}
      },
      "elements": [
        {
          "subject": {
            "identity": "grandchild",
            "labels": [],
            "properties": {}
          },
          "elements": []
        }
      ]
    }
  ]
}
Symbol Value:
{
  "subject": {
    "identity": "node",
    "labels": [],
    "properties": {
      "type": {
        "type": "symbol",
        "value": "identifier"
      }
    }
  },
  "elements": []
}
Range Value:
{
  "subject": {
    "identity": "sensor",
    "labels": [],
    "properties": {
      "range": {
        "type": "range",
        "lower": 0.0,
        "upper": 100.0
      }
    }
  },
  "elements": []
}
Measurement Value:
{
  "subject": {
    "identity": "reading",
    "labels": [],
    "properties": {
      "temperature": {
        "type": "measurement",
        "unit": "°C",
        "value": 23.5
      }
    }
  },
  "elements": []
}
Array Value:
{
  "subject": {
    "identity": "data",
    "labels": [],
    "properties": {
      "numbers": [1, 2, 3, 4, 5],
      "mixed": [42, "text", true, {"nested": "map"}]
    }
  },
  "elements": []
}
Map Value:
{
  "subject": {
    "identity": "config",
    "labels": [],
    "properties": {
      "settings": {
        "enabled": true,
        "timeout": 30,
        "endpoint": "https://example.com"
      }
    }
  },
  "elements": []
}
The canonical form enforces additional constraints for deterministic output:
- Sorted Keys: All object keys sorted alphabetically
- Consistent Formatting: 2-space indentation, Unix line endings
- No Trailing Whitespace: Clean output
- Stable Metadata: Fixed timestamp/hash for deterministic mode
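The sorting and formatting constraints above can be approximated with a standard JSON encoder. The sketch below is an illustration rather than the `gramref` implementation: `canonicalize` and `to_canonical_json` are hypothetical names, "alphabetical" is assumed to mean code-point order, non-ASCII characters are assumed to be emitted unescaped, and stable metadata is not covered.

```python
import json


def canonicalize(pattern):
    """Return a copy of pattern with labels sorted at every nesting level."""
    subject = pattern["subject"]
    return {
        "subject": {
            "identity": subject["identity"],
            "labels": sorted(subject["labels"]),
            "properties": subject["properties"],
        },
        "elements": [canonicalize(element) for element in pattern["elements"]],
    }


def to_canonical_json(pattern):
    """Serialize with sorted keys, 2-space indentation, and "\n" line endings."""
    # sort_keys orders the keys of every object, including nested property maps.
    # With indent set, json.dumps separates lines with "\n" and adds no trailing spaces.
    return json.dumps(canonicalize(pattern), sort_keys=True, indent=2, ensure_ascii=False)


# Two structurally identical patterns built in different key and label order.
a = {"subject": {"identity": "alice", "labels": ["User", "Person"],
                 "properties": {"b": 1, "a": 2}}, "elements": []}
b = {"elements": [], "subject": {"properties": {"a": 2, "b": 1},
                                 "labels": ["Person", "User"], "identity": "alice"}}
print(to_canonical_json(a) == to_canonical_json(b))
```

Because keys are sorted, `"elements"` precedes `"subject"` in the output of this sketch, which is what makes byte-for-byte comparison independent of construction order.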
Generate Canonical JSON:
gramref parse input.gram --format json --value-only --canonical
Roundtrip Test:
# Gram → JSON → Gram
echo '(person {name: "Alice"})' | gramref parse --format json --value-only | \
  gramref convert --from json --to gram
A formal JSON Schema (Draft 2020-12) is available:
gramref schema --format json-schema > pattern-schema.json
This schema can be used to:
- Validate JSON output from implementations
- Generate types in other languages
- Document the canonical structure
Type-safe definitions are available for TypeScript and Rust:
TypeScript:
gramref schema --format typescript > pattern.ts
Rust:
gramref schema --format rust > pattern.rs
JSON has a single number type, so a decimal that is a whole number is serialized without a fractional part and becomes indistinguishable from an integer. For example:
- 2.0 serializes as 2
- -1.0 serializes as -1
Implementations should handle this by:
- Using semantic equivalence for comparison
- Accepting both integer and decimal representations during parsing
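Semantic equivalence for the integer/decimal ambiguity can be sketched as a recursive comparison over decoded JSON. This is a minimal illustration with a hypothetical function name, `semantic_equal`. It treats `2` and `2.0` as equal but keeps booleans distinct from numbers, which needs an explicit check in Python because `True == 1` evaluates to true.

```python
import json


def semantic_equal(a, b):
    """Compare two decoded JSON values, treating whole-number decimals as integers."""
    # Booleans only equal booleans (Python itself considers True == 1).
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    # Integers and decimals compare by numeric value: 2 == 2.0.
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a == b
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(semantic_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(semantic_equal(a[k], b[k]) for k in a)
    # Strings and null: require the same type and value.
    return type(a) is type(b) and a == b


left = json.loads('{"type": "range", "lower": 0.0, "upper": 100.0}')
right = json.loads('{"type": "range", "lower": 0, "upper": 100}')
print(semantic_equal(left, right))
```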
- Empty arrays: []
- Empty objects: {}
- Empty labels: []
- Empty properties: {}
- Anonymous subject: {"identity": "", ...}
Strings are JSON-escaped following standard rules:
- Quotes: \"
- Newlines: \n
- Backslashes: \\
- Unicode: \uXXXX
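These escaping rules are the ones any standard JSON encoder applies. As a short illustration using Python's `json` module (the sample string is made up for this example), all four escapes appear when ASCII-only output is requested, and decoding restores the original text:

```python
import json

# Contains a double quote, a newline, a backslash, and a non-ASCII character.
raw = 'She said "hi"\nC:\\temp \u00e9'

# ensure_ascii=True (the default) writes non-ASCII characters as \uXXXX escapes.
encoded = json.dumps(raw)
print(encoded)

# Decoding reverses every escape, so the roundtrip is lossless.
print(json.loads(encoded) == raw)
```

Whether the canonical form writes non-ASCII characters as `\uXXXX` or as raw UTF-8 is not stated here; both spellings decode to the same string.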
The canonical JSON format is validated through:
- 35+ Unit Tests: Covering all value types and patterns
- 200 QuickCheck Properties: Random pattern generation and roundtrip testing
- Semantic Equivalence: Handling integer/decimal ambiguity
- Corpus Tests: Validation against tree-sitter-gram test corpus
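The property-based approach can be illustrated without QuickCheck. The sketch below is not the project's test suite: it uses Python's `random` module, hypothetical helper names, and only a subset of value types (integers, booleans, strings, symbols, arrays). It checks a JSON roundtrip in Python rather than a Gram roundtrip.

```python
import json
import random


def random_value(rng, depth=0):
    """Generate a random value from a small subset of the supported types."""
    choices = [
        lambda: rng.randint(-1000, 1000),
        lambda: rng.choice([True, False]),
        lambda: "".join(rng.choice('abc "\\\n') for _ in range(rng.randint(0, 5))),
        lambda: {"type": "symbol", "value": rng.choice(["a", "b"])},
    ]
    if depth < 2:  # bound the nesting of arrays
        choices.append(lambda: [random_value(rng, depth + 1) for _ in range(rng.randint(0, 3))])
    return rng.choice(choices)()


def random_pattern(rng, depth=0):
    """Generate a random pattern at most four levels deep."""
    subject = {
        "identity": rng.choice(["", "a", "b"]),
        "labels": sorted(rng.sample(["Admin", "Person", "User"], rng.randint(0, 3))),
        "properties": {key: random_value(rng)
                       for key in rng.sample(["x", "y", "z"], rng.randint(0, 3))},
    }
    width = rng.randint(0, 2) if depth < 3 else 0
    return {"subject": subject,
            "elements": [random_pattern(rng, depth + 1) for _ in range(width)]}


def check_roundtrip(trials=200, seed=0):
    """Assert that serialize -> parse -> serialize is lossless and byte-stable."""
    rng = random.Random(seed)
    for _ in range(trials):
        pattern = random_pattern(rng)
        text = json.dumps(pattern, sort_keys=True, indent=2)
        parsed = json.loads(text)
        assert parsed == pattern
        # Re-serializing the parsed value must reproduce the same text.
        assert json.dumps(parsed, sort_keys=True, indent=2) == text
    return trials


print(check_roundtrip())
```

Decimals are left out of the generator on purpose, so the integer/decimal ambiguity does not affect the strict equality check.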
- Gram Serialization - Overall serialization features
- JSON Schema Specification - Formal schema
- TypeScript Types - Type definitions
- Rust Types - Struct definitions