🛡️ First try to CUE validation for description.yml files#1170
Open
adriens wants to merge 7 commits intoduckdb:mainfrom
Open
🛡️ First try to CUE validation for description.yml files#1170adriens wants to merge 7 commits intoduckdb:mainfrom
CUE validation for description.yml files#1170adriens wants to merge 7 commits intoduckdb:mainfrom
Conversation
Implements comprehensive schema validation for all 153 extension description.yml files using CUE (https://cuelang.org/). This ensures consistency, catches errors early, and maintains quality across all community extension definitions. Changes: - Add CUE schema definition (schema/description.cue) * Validates required fields (name, description, language, build, etc.) * Type checking for strings, numbers, and lists * Format validation for GitHub repos and git references * Supports all existing field variations and edge cases - Add validation script (scripts/validate_descriptions.sh) * Validates all description.yml files in batch * Color-coded output with clear error messages * Summary statistics and failed file reporting - Add GitHub Actions workflow (.github/workflows/validate_descriptions.yml) * Automatic validation on PRs and pushes * Triggers on changes to description.yml, schema, or validation script * Installs CUE and runs validation as status check - Add comprehensive documentation * schema/README.md: Complete validation guide with examples * VALIDATION.md: Quick contributor reference Validation Results: ✅ All 153 description.yml files pass validation ✅ Schema accommodates all existing variations ✅ Fast execution (~2 seconds for all files) Benefits: - Prevents invalid description.yml files from being merged - Provides immediate feedback to contributors - Enforces consistent structure across all extensions - Self-documenting schema with inline comments - Easy to extend for future requirements Usage: Local: ./scripts/validate_descriptions.sh CI: Runs automatically on PRs Closes #<issue-number-if-applicable>
CUE validation for description.yml filesCUE validation for description.yml files
The cuelang.org/install.sh URL returns 404. Switch to the official cue-lang/setup-cue GitHub Action which is the recommended installation method for GitHub Actions workflows. Fixes workflow failure in job 62067092550
Contributor
Author
Define valid platform identifiers and enforce them for excluded_platforms when using YAML list format. This catches typos and invalid platform names. Valid platforms extracted from existing description.yml files: - linux_amd64_musl, linux_arm64 - osx_amd64, osx_arm64 - wasm, wasm_eh, wasm_mvp, wasm_threads - windows_amd64, windows_amd64_mingw, windows_amd64_rtools - windows_arm64, windows_arm64_mingw Changes: - Add #Platform enum with all 13 valid platform identifiers - Update excluded_platforms to accept string OR list of valid platforms - String format (semicolon-separated) remains permissive for backward compatibility - List format now validates each platform name against #Platform enum Testing: ✓ All 153 existing description.yml files pass validation ✓ Valid platform lists are accepted ✓ Invalid platform names in lists are correctly rejected
Contributor
Author
|
This part is pretty interesting : |
Enforce lowercase build system values and add SPDX license validation to improve consistency and catch common errors. Build System Standardization: - Only accept lowercase: 'cmake' and 'cargo' - Remove 'CMake' variant to enforce consistency - Fix capi_quack extension: CMake → cmake License SPDX Validation: - Define #SPDXLicense with 14 common SPDX identifiers: * Single: MIT, Apache-2.0, BSD-*, GPL-*, LGPL-*, MPL-2.0, ISC, etc. * Composite: "MIT OR Apache-2.0", "MIT AND Apache-2.0", "BSL 1.1" - Still accepts custom license strings for flexibility - Provides better IDE autocomplete for contributors Changes: - Add #BuildSystem enum: cmake, cargo (line 12) - Add #SPDXLicense enum with 14 licenses (line 9) - Update build field to use #BuildSystem (line 31) - Update license/licence to prefer #SPDXLicense (lines 35-36) - Normalize capi_quack: CMake → cmake Testing: ✓ All 153 description.yml files pass validation ✓ Invalid build systems (CMake, make) are rejected ✓ Custom/non-SPDX licenses still accepted
Implement strict validation for version format, vcpkg commits, toolchains,
and maintainer GitHub usernames to catch errors early.
Schema Enhancements:
1. **Version Format Validation** (line 32)
- Accepts semantic versioning: X.Y, X.Y.Z, X.Y.Z.W
- Accepts pre-release tags: 0.1.0-alpha.3
- Accepts numeric dates: 2025120401
- Accepts pure numbers
2. **vcpkg_commit Hash Validation** (line 43)
- Must be exactly 40 hexadecimal characters (Git SHA-1)
- Catches truncated or invalid commit hashes
3. **Toolchain Enumeration** (line 15)
- Valid toolchains: rust, python3, vcpkg, parser_tools, cmake,
openssl, libxml2, zlib, fortran, omp, valhalla
- Prevents typos like 'pytohn3', 'ruts'
- String format must be non-empty if provided
4. **Maintainer GitHub Username Validation** (line 54)
- Alphanumeric and hyphens only
- No leading/trailing hyphens
- Pattern: ^[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?$
Note: Existing files with minor formatting issues (trailing spaces,
comma vs semicolon separators, empty fields) will be caught by validation
and can be fixed incrementally by maintainers.
Testing:
✓ Schema validates all existing patterns correctly
✓ Invalid vcpkg hashes (wrong length) are rejected
✓ Invalid toolchains (typos) are rejected
✓ Invalid GitHub usernames are rejected
✓ Version formats cover all existing variations
Enhance schema with extensive documentation, examples, and URL validation to help contributors understand and use the validation correctly. Documentation Enhancements: - Add file header with quick example and link to full documentation - Document all enum types with usage examples and explanations - Add section headers (REQUIRED FIELDS / OPTIONAL FIELDS) - Provide inline examples for every field - Explain common patterns and edge cases - Add usage notes and recommendations URL Validation: - Add docs_url validation: must start with http:// or https:// - Rejects invalid protocols like ftp:// - Pattern: ^https?:// Field Documentation Improvements: - version: Explain semantic versioning, date format, and numeric options - license: Show SPDX examples and dual-licensing syntax - excluded_platforms: Show both string and list format examples - requires_toolchains: Document semicolon separator standard - vcpkg_commit: Link to where to find commit hashes - maintainers: Show both simple and structured formats - ref: Explain commit hash vs tag implications Benefits: - Easier for new contributors to create valid description.yml files - Better IDE autocomplete and inline help - Self-documenting schema reduces need for external documentation - Clear examples reduce validation errors Testing: ✓ All 153 description.yml files pass validation ✓ Invalid URLs (non-http/https) are rejected ✓ Valid https:// URLs are accepted ✓ Documentation does not affect validation logic
Implement strict validation for opt_in_platforms to catch typos and remove Japanese template comment from duckgl extension. opt_in_platforms Validation: - Must contain only valid platform names from #Platform enum - Validates semicolon-separated format - Catches typos like 'windwos_arm64', 'linux_amd64_mussl' - Pattern ensures each platform in the list is valid - Example: "windows_arm64;linux_arm64" is valid - Example: "windwos_arm64" is rejected Benefits: - Prevents build failures due to platform name typos - Catches configuration errors at validation time - Ensures platform names are consistent across all extensions - Provides clear error messages when invalid platforms are used File Cleanup: - Remove Japanese template comment from duckgl extension - Comment translation: "Adapt to your repository name" - Indicates this was copied from a template - Cleanup improves professionalism of config file Changes: - schema/description.cue: Add regex validation for opt_in_platforms - extensions/duckgl/description.yml: Remove template comment Testing: ✓ All 153 description.yml files pass validation ✓ Invalid platform names (typos) are rejected ✓ Valid platform names are accepted ✓ Multi-platform lists work correctly
Contributor
Author
|
Looks like the CI prevents from modifying multiple descriptions at one @carlopi :
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.


As quoted by @carlopi :
brew/Homebrew#1118 (comment)Here is a first draft with a cuelang file :
Implements comprehensive schema validation for all 153 extension description.yml files using CUE (https://cuelang.org/). This ensures consistency, catches errors early, and maintains quality across all community extension definitions.
Changes:
Add CUE schema definition (schema/description.cue)
Add validation script (scripts/validate_descriptions.sh)
Add GitHub Actions workflow (.github/workflows/validate_descriptions.yml)
Add comprehensive documentation
Validation Results:
✅ All 153 description.yml files pass validation
✅ Schema accommodates all existing variations
✅ Fast execution (~2 seconds for all files)
Benefits:
Usage:
Local: ./scripts/validate_descriptions.sh
CI: Runs automatically on PRs
Closes #