Overview
Add functionality for reviewers and admins to publish metadata records directly to a GitHub repository. This feature will enable automated publishing of ISO19115-3 XML and YAML files to configurable GitHub repositories with support for multiple deployment environments.
Integration with CIOOS Infrastructure
Once published to the GitHub repository:
- GitHub Pages: The repository will host a static website that presents the metadata files in a browsable format
- CKAN Harvesting: The different Regional Association (RA) CKAN catalogues will automatically harvest the published metadata from the GitHub repository
- Catalogue Updates: The harvested metadata will be used to update the regional CKAN catalogues, making the records discoverable through the CIOOS metadata search infrastructure
This creates an automated pipeline: Metadata Entry Form → GitHub Repository → Static Website → CKAN Catalogues → Public Discovery
User Story
As a reviewer or admin
I want to publish approved metadata records to a GitHub repository
So that the metadata files are hosted on a static website and automatically harvested by Regional Association CKAN catalogues, making them discoverable through the CIOOS metadata search infrastructure
Detailed Requirements
Workflow
- Initiate Publishing: Reviewer/admin clicks a "Publish to GitHub" button on the reviewer page for a specific record
- Select Environment: A dialog appears allowing the user to select which environment(s) to publish to (e.g., dev, staging, production)
- Confirm and Customize: User confirms the selection and optionally provides a custom git commit message
- Convert to XML: The form data is converted to ISO19115-3 XML format using the existing Python Firebase conversion function
- Generate YAML: The form data is also converted to YAML format
- Upload to GitHub: Both files are committed to the configured GitHub repository using the GitHub API
- Confirmation: The frontend displays a success message with:
  - Link to the commit in GitHub
  - List of files uploaded
  - Confirmation that the repository was successfully updated
Required Configurations (Admin Page)
The admin page should include a new section for GitHub Publishing Configuration:
GitHub Repository Settings
- Repository Owner: GitHub organization or user account name that owns the repository where metadata will be published
  - Default: cioos-siooc
- Repository Name: Repository name; combined with the owner, it forms the full URL: https://github.com/{owner}/{repo}
  - Default: cioos-siooc-forms
- GitHub Token: Personal Access Token with repo write permissions (stored securely)
- Target Branch: Branch name for commits (default: main)
Environment Configuration
- Environments List: Configurable list of environment names
  - Default: ["prod"]
  - Example: ["dev", "staging", "prod"]
- Each environment represents a subdirectory path: forms/{environment}/
File Naming Configuration
- Naming Convention Template: Configurable template for output filenames
  - Variables available: {uuid}, {filename}, {title}, {date}
  - Default: {uuid} (uses the record's identifier field)
  - Example: {filename} or record-{uuid}
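The template expansion described above could be sketched as follows. This is a minimal illustration, not the project's implementation: the record field names (identifier, filename, title.en) follow the helper stub later in this issue, while sanitizeSegment, the allowed character set, and the UTC date format are assumptions.

```javascript
// Hypothetical helper: makes a value safe to use as part of a file name.
// The allowed character set is an assumption, not a project requirement.
function sanitizeSegment(value) {
  return String(value)
    .trim()
    .replace(/[^A-Za-z0-9._-]+/g, "-") // collapse runs of unsafe characters into one hyphen
    .replace(/^-+|-+$/g, ""); // drop leading and trailing hyphens
}

// Expands {uuid}, {filename}, {title}, and {date} in the configured template.
function generateFilename(template, recordData, now = new Date()) {
  const values = {
    uuid: recordData.identifier,
    filename: recordData.filename,
    title: recordData.title && recordData.title.en,
    date: now.toISOString().slice(0, 10), // YYYY-MM-DD in UTC (assumed format)
  };
  const name = template.replace(/\{(uuid|filename|title|date)\}/g, (match, key) => {
    const value = values[key];
    if (value === undefined || value === null || value === "") {
      throw new Error(`Record has no value for template variable ${match}`);
    }
    return sanitizeSegment(value);
  });
  if (!name) {
    throw new Error("File naming template produced an empty name");
  }
  return name;
}
```

Throwing on a missing value, rather than silently writing a file named after an empty string, is one possible design choice; falling back to the record identifier would be another.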
Repository Structure
Files will be organized as:
forms/
└── prod/                      # Default environment
    ├── record-uuid-1.xml
    ├── record-uuid-1.yaml
    ├── record-uuid-2.xml
    ├── record-uuid-2.yaml
    └── ...
# Optional additional environments (if configured):
forms/
├── dev/
│   ├── record-uuid-1.xml
│   └── record-uuid-1.yaml
├── staging/
│   ├── record-uuid-2.xml
│   └── record-uuid-2.yaml
└── prod/
    ├── record-uuid-3.xml
    └── record-uuid-3.yaml
Technical Specifications
1. Admin Page Modifications
File: src/components/Pages/Admin.jsx
New UI Section: "GitHub Publishing Configuration"
Firebase Database Structure:
admin/{region}/githubCredentials/
├── owner: string (default: "cioos-siooc")
├── repo: string (default: "cioos-siooc-forms")
├── token: string (encrypted at rest; not hashed, since the function must recover the token to call the GitHub API)
├── branch: string (default: "main")
├── environments: string[] (default: ["prod"])
└── fileNamingTemplate: string (default: "{uuid}")
Note: Stored as a sibling to the existing dataciteCredentials path to leverage similar security patterns.
Implementation Notes:
- Follow existing patterns from dataciteCredentials for secure token storage
- Use Firebase Security Rules to restrict write access to the admin role only
- Validate owner and repo names (alphanumeric, hyphens, underscores)
- Test token validity on save (optional)
- Full repository URL is constructed as: https://github.com/${owner}/${repo}
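A possible shape for the owner and repository validation mentioned in the notes above. The function name validateGitHubConfig and both patterns are assumptions: they approximate GitHub's naming rules (the repository pattern also allows periods) and should be checked against GitHub's current constraints before use.

```javascript
// Approximate patterns, assumed for illustration only.
const OWNER_PATTERN = /^[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?$/; // no leading/trailing hyphen
const REPO_PATTERN = /^[A-Za-z0-9_.-]+$/;

// Hypothetical validator for the admin form; returns errors instead of throwing
// so the UI can show all problems at once.
function validateGitHubConfig({ owner, repo, branch = "main" }) {
  const errors = [];
  if (!OWNER_PATTERN.test(owner || "")) {
    errors.push("Repository owner may only contain letters, digits, and hyphens");
  }
  if (!REPO_PATTERN.test(repo || "")) {
    errors.push("Repository name may only contain letters, digits, hyphens, underscores, and periods");
  }
  if (!branch || /\s/.test(branch)) {
    errors.push("Target branch must be non-empty and contain no whitespace");
  }
  const valid = errors.length === 0;
  return {
    valid,
    errors,
    // Only build the URL once both parts are known to be safe
    url: valid ? `https://github.com/${owner}/${repo}` : null,
  };
}
```

With the defaults from this issue, validateGitHubConfig({ owner: "cioos-siooc", repo: "cioos-siooc-forms" }) reports a valid configuration and the URL https://github.com/cioos-siooc/cioos-siooc-forms.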
2. Reviewer Page Modifications
File: src/components/Pages/Reviewer.jsx
New UI Elements:
- Add "Publish to GitHub" button in the record actions menu (alongside existing Publish/Unpublish buttons)
- Create a new dialog component: GitHubPublishDialog
Dialog Components:
<GitHubPublishDialog>
  <EnvironmentSelector
    environments={adminConfig.environments}
    selected={selectedEnvs}
    onChange={handleEnvChange}
  />
  <TextField
    label="Commit Message (optional)"
    placeholder="Default: Publish metadata record: {record title}"
    value={commitMessage}
    onChange={handleCommitMessageChange}
  />
  <DialogActions>
    <Button onClick={handleCancel}>Cancel</Button>
    <Button onClick={handlePublish} color="primary">Publish</Button>
  </DialogActions>
</GitHubPublishDialog>
New Functions:
- handleGitHubPublish(recordId, environments, commitMessage): Orchestrates the publishing workflow
- publishToGitHub(): Calls the Firebase Cloud Function
3. Firebase Cloud Function
New File: firebase-functions/functions/githubPublish.js
Function Name: githubPublishRecord
Input Parameters:
{
  recordId: string,
  userId: string,
  region: string,
  environments: string[],
  commitMessage?: string
}
Function Workflow:
- Authenticate the caller (verify reviewer/admin role)
- Fetch the record data from Firebase
- Load GitHub configuration from admin/{region}/githubCredentials/
- Convert record to ISO19115-3 XML using existing Python function or external API
- Convert record to YAML format
- For each environment:
  - Generate filename using the configured template
  - Prepare file content for XML and YAML
  - Use Octokit to commit both files to GitHub
- Return success response with commit details
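The "Use Octokit to commit both files" step could look roughly like the sketch below, which puts both files into a single commit through GitHub's Git Database endpoints. It assumes an Octokit instance from @octokit/rest that exposes those endpoints under octokit.rest.git, and it omits the recordData argument that the helper stub in this issue passes along; error handling and retries are left out.

```javascript
// Sketch only: commits several files to a branch in one commit.
async function commitFilesToGitHub(octokit, owner, repo, branch, files, message) {
  const git = octokit.rest.git;
  // 1. Current commit SHA for the branch
  const { data: ref } = await git.getRef({ owner, repo, ref: `heads/${branch}` });
  const parentSha = ref.object.sha;
  // 2. Tree SHA of that commit
  const { data: parent } = await git.getCommit({ owner, repo, commit_sha: parentSha });
  // 3. One blob per file
  const tree = [];
  for (const file of files) {
    const { data: blob } = await git.createBlob({
      owner,
      repo,
      content: file.content,
      encoding: "utf-8",
    });
    tree.push({ path: file.path, mode: "100644", type: "blob", sha: blob.sha });
  }
  // 4. New tree layered on the existing one, so untouched files are kept
  const { data: newTree } = await git.createTree({
    owner,
    repo,
    base_tree: parent.tree.sha,
    tree,
  });
  // 5. New commit pointing at the new tree
  const { data: commit } = await git.createCommit({
    owner,
    repo,
    message,
    tree: newTree.sha,
    parents: [parentSha],
  });
  // 6. Move the branch to the new commit
  await git.updateRef({ owner, repo, ref: `heads/${branch}`, sha: commit.sha });
  return commit; // includes sha and html_url for the confirmation message
}
```

Because the tree is built on base_tree, re-publishing a record replaces only its own XML and YAML files. If another commit lands on the branch between steps 1 and 6, the final reference update can be rejected, so a retry may be needed.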
Implementation Pattern (following existing issue.js):
const functions = require("firebase-functions");
const admin = require("firebase-admin"); // assumes admin.initializeApp() has been called (e.g. in index.js)
const { Octokit } = require("@octokit/rest");
const axios = require("axios");
exports.githubPublishRecord = functions.https.onCall(async (data, context) => {
  // 1. Authentication check
  if (!context.auth) {
    throw new functions.https.HttpsError("unauthenticated", "User must be authenticated");
  }
  // TODO: also verify that the caller has the reviewer or admin role for the region
  const { recordId, userId, region, environments, commitMessage } = data;
  // 2. Fetch record from Firebase
  const recordSnapshot = await admin.database()
    .ref(`${region}/users/${userId}/records/${recordId}`)
    .once("value");
  const recordData = recordSnapshot.val();
  if (!recordData) {
    throw new functions.https.HttpsError("not-found", "Record not found");
  }
  // 3. Load GitHub config
  const configSnapshot = await admin.database()
    .ref(`admin/${region}/githubCredentials`)
    .once("value");
  const githubConfig = configSnapshot.val();
  // 4. Convert to XML and YAML
  // Call existing convert_metadata Python function or external API
  const xmlContent = await convertToXML(recordData);
  const yamlContent = await convertToYAML(recordData);
  // 5. Generate filename
  const filename = generateFilename(githubConfig.fileNamingTemplate, recordData);
  // 6. Initialize Octokit
  const octokit = new Octokit({ auth: githubConfig.token });
  // Get repository owner and name from config
  const { owner, repo } = githubConfig;
  // 7. Commit files for each environment
  const results = [];
  for (const env of environments) {
    const xmlPath = `forms/${env}/${filename}.xml`;
    const yamlPath = `forms/${env}/${filename}.yaml`;
    // Create or update files using GitHub API
    const commit = await commitFilesToGitHub(
      octokit,
      owner,
      repo,
      githubConfig.branch,
      [
        { path: xmlPath, content: xmlContent },
        { path: yamlPath, content: yamlContent }
      ],
      // Fall back to the record ID when the record has no English title
      commitMessage || `Publish metadata record: ${recordData.title?.en || recordId}`,
      recordData
    );
    results.push({
      environment: env,
      commitSha: commit.sha,
      commitUrl: commit.html_url,
      files: [xmlPath, yamlPath]
    });
  }
  return {
    success: true,
    results: results
  };
});
Helper Functions:
function generateFilename(template, recordData) {
  // Replace template variables with record data
  // {uuid} -> recordData.identifier
  // {filename} -> recordData.filename
  // {title} -> sanitized recordData.title.en
  // {date} -> current date
}
async function commitFilesToGitHub(octokit, owner, repo, branch, files, message, recordData) {
  // Use GitHub API to:
  // 1. Get current commit SHA for the branch
  // 2. Get tree SHA for current commit
  // 3. Create blobs for each file
  // 4. Create new tree with updated files
  // 5. Create new commit
  // 6. Update branch reference
  // See: https://docs.github.com/en/rest/git/commits#create-a-commit
}
async function convertToXML(recordData) {
  // Call existing Python function via httpsCallable
  // Or use external API: https://api.forms.cioos.ca/record
}
async function convertToYAML(recordData) {
  // Call existing Python function via httpsCallable
  // Or convert using record_json_to_yaml.py logic
}
4. Conversion Integration
Existing Resources:
- Python function: firebase-functions/python-functions/main.py → convert_metadata
- External API: https://api.forms.cioos.ca/record
- Package: cioos-metadata-conversion
Approach Options:
Option A: Call existing Python Firebase function
// Note: httpsCallable is a Firebase client SDK method; if this runs inside the
// Cloud Function, the Python function would have to be called over HTTPS instead.
const convertMetadata = functions.httpsCallable("convert_metadata");
const xmlResult = await convertMetadata({
  record_data: recordData,
  output_format: "xml"
});
const yamlResult = await convertMetadata({
  record_data: recordData,
  output_format: "yaml"
});
Option B: Call external API
const response = await axios.post("https://api.forms.cioos.ca/record", {
  record_data: recordData,
  output_format: "xml" // or "yaml"
});
const content = response.data;
Recommendation: Use Option A (Python function) for consistency and to avoid external dependencies.
5. User Context Integration
File: src/providers/UserProvider.jsx
New Function Export:
const publishRecordToGitHub = functions.httpsCallable("githubPublishRecord");
// Add to UserContext value
value={{
  // ... existing functions
  publishRecordToGitHub,
}}
6. Security Rules Update
File: firebase-functions/database.rules.json
Add Single Rule for GitHub Credentials (add as sibling to existing dataciteCredentials rule):
"githubCredentials": {
  // Allow write access to GitHub credentials if the authenticated user's email is listed as an admin in the permissions for the region.
  ".write": "root.child('admin').child($regionAdmin).child('permissions').child('admins').val().contains(auth.email)"
}
Location: Add this inside the admin/$regionAdmin object, right after the existing dataciteCredentials rule (around line 65).
Rationale:
- Read access: Inherited from parent rule (line 35) - allows reviewers and admins to read
- Write access: Only admins can modify GitHub credentials (matches DataCite pattern)
- Minimal change to existing rules structure
- Follows the exact same pattern as dataciteCredentials
7. Dependencies
New npm packages (for Firebase Functions):
- @octokit/rest: Already used in issue.js, no additional installation needed
Firebase Functions Configuration:
# Store GitHub token as Firebase parameter (alternative to database storage)
firebase functions:config:set github.token="ghp_xxxxxxxxxxxxx"
Affected Files
New Files
- firebase-functions/functions/githubPublish.js - New Cloud Function for GitHub integration
- src/components/Dialogs/GitHubPublishDialog.jsx - New dialog component (optional, can be inline)
Modified Files
- src/components/Pages/Admin.jsx - Add GitHub configuration section
- src/components/Pages/Reviewer.jsx - Add publish button and dialog
- src/providers/UserProvider.jsx - Expose new Cloud Function
- firebase-functions/functions/index.js - Export new function
- firebase-functions/database.rules.json - Add security rules
- src/utils/firebaseRecordFunctions.js - Add helper functions (optional)
Implementation Phases
Phase 1: Admin Configuration UI
- Create GitHub config section in Admin page
- Add Firebase database structure for GitHub settings
- Implement configuration save/load functionality
- Update security rules
Phase 2: Reviewer Publishing UI
- Add "Publish to GitHub" button to Reviewer page
- Create environment selection dialog
- Add commit message input field
- Implement frontend validation
Phase 3: Backend Function
- Create githubPublish.js Cloud Function
- Implement authentication and authorization checks
- Integrate with existing conversion functions
- Implement GitHub API integration using Octokit
- Handle file naming template logic
Phase 4: Integration & Testing
- Connect frontend UI to backend function
- Test with multiple environments
- Test re-publishing (overwrite) behavior
- Error handling and user feedback
- Display success confirmation with commit links
Phase 5: Documentation & Deployment
- Update README with new feature documentation
- Create admin guide for GitHub configuration
- Deploy Cloud Functions
- Deploy frontend updates
Testing Checklist
Admin Page
- Can save GitHub repository owner and name
- Can save and retrieve GitHub token securely
- Can configure target branch (default: main)
- Can add/remove/edit environments list
- Can configure file naming template
- Configuration only accessible to admins
- Configuration persists across sessions
Reviewer Page
- "Publish to GitHub" button visible to reviewers and admins
- Button disabled if GitHub not configured
- Environment selector displays configured environments
- Can select single or multiple environments
- Commit message field accepts custom text
- Default commit message displays correctly
- Loading state shown during publishing
Backend Function
- Authenticates user correctly
- Verifies reviewer/admin permissions
- Fetches record data successfully
- Converts to XML format correctly
- Converts to YAML format correctly
- Applies file naming template correctly
- Commits to correct branch
- Creates correct directory structure (forms/{env}/)
- Both XML and YAML files appear in repository
- Handles multiple environments in single request
- Overwrites existing files correctly
- Returns commit URL and details
Error Handling
- Invalid GitHub token shows meaningful error
- Network errors handled gracefully
- Missing configuration shows helpful message
- Unauthorized users receive proper error
- Conversion failures reported to user
- GitHub API rate limiting handled
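One way to turn failed GitHub requests into the user-facing messages listed above. The function name describeGitHubError and the message wording are hypothetical. The sketch assumes the error carries an HTTP status and, optionally, response.headers, as Octokit's request errors do, and it treats a missing status as a network failure.

```javascript
// Hypothetical mapping from a failed GitHub request to a message for the reviewer.
function describeGitHubError(error) {
  const status = error && error.status;
  const headers = (error && error.response && error.response.headers) || {};
  if (status === 401) {
    return "The configured GitHub token is invalid or has expired.";
  }
  // GitHub signals an exhausted rate limit with 403 or 429 and a zero remaining count
  if ((status === 403 || status === 429) && headers["x-ratelimit-remaining"] === "0") {
    return "GitHub API rate limit reached. Try again later.";
  }
  if (status === 403) {
    return "The GitHub token does not have write access to the repository.";
  }
  if (status === 404) {
    return "Repository or branch not found. Check the GitHub Publishing Configuration.";
  }
  if (status === undefined) {
    return "Could not reach GitHub. Check the network connection and try again.";
  }
  return `GitHub request failed with status ${status}.`;
}
```

The Cloud Function could wrap the commit call in try/catch and rethrow the resulting text inside a functions.https.HttpsError so the frontend can display it.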
Edge Cases
- Re-publishing same record overwrites files
- Special characters in filenames handled correctly
- Very long commit messages truncated appropriately
- Empty/missing title handled in commit message
- Record with missing required fields handled
- Publishing to non-existent branch handled
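The commit message edge cases above (long messages, missing title) could be handled along these lines. The function name buildCommitMessage, the 72-character limit, and the fallback to the record identifier are assumptions made for illustration; the issue does not specify them.

```javascript
const MAX_MESSAGE_LENGTH = 72; // assumed limit, not specified in this issue

// Hypothetical helper: picks the custom message or a default, then cleans and truncates it.
function buildCommitMessage(customMessage, recordData) {
  // Replace control characters (including newlines) with spaces before trimming
  const custom = (customMessage || "").replace(/[\u0000-\u001f\u007f]+/g, " ").trim();
  const title = ((recordData && recordData.title && recordData.title.en) || "").trim();
  const identifier = (recordData && recordData.identifier) || "untitled";
  const fallback = `Publish metadata record: ${title || identifier}`;
  const message = custom || fallback;
  return message.length > MAX_MESSAGE_LENGTH
    ? `${message.slice(0, MAX_MESSAGE_LENGTH - 3)}...`
    : message;
}
```

For example, an empty custom message with a record titled "Kelp survey" yields "Publish metadata record: Kelp survey", while a 100-character custom message is cut to 72 characters ending in "...".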
Success Criteria
- Admins can configure GitHub repository settings via Admin page
- Reviewers and admins can publish records to GitHub from Reviewer page
- Users can select one or more environments to publish to
- Users can provide custom commit messages
- Records are converted to both XML and YAML formats
- Files are committed to correct paths: forms/{environment}/{filename}.{ext}
- Success confirmation shows commit link and file list
- Re-publishing overwrites existing files with new commit
- All security rules properly enforced
- Feature works across all regions
Downstream Integration
GitHub Pages Static Website
The GitHub repository hosting the published metadata files should be configured with GitHub Pages to serve the files as a static website. This enables:
- Direct browsing of metadata files via HTTP/HTTPS
- Version-controlled hosting with automatic updates on each commit
- CDN-backed delivery for reliable access by harvesting systems
Configuration Requirements:
- GitHub Pages enabled on the repository
- Source branch matches the configured publish branch (default: main)
- Base URL will be: https://{owner}.github.io/{repo}/
- Example with default repo: https://cioos-siooc.github.io/cioos-siooc-forms/
- Metadata file URLs: https://{owner}.github.io/{repo}/forms/{environment}/{filename}.xml
- Example full URL: https://cioos-siooc.github.io/cioos-siooc-forms/forms/prod/record-123.xml
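The repository paths and GitHub Pages URLs above follow directly from the configuration, as this small sketch shows. The function name buildPublishedPaths is hypothetical, and the URL form assumes a project site on the default github.io domain; a custom domain would change the base URL.

```javascript
// Hypothetical helper: derives repository paths and public URLs for one record.
function buildPublishedPaths({ owner, repo }, environment, filename) {
  const pagesBase = `https://${owner}.github.io/${repo}/`;
  return ["xml", "yaml"].map((ext) => {
    const repoPath = `forms/${environment}/${filename}.${ext}`;
    return {
      repoPath,
      // Resolve the relative path against the Pages base URL
      pagesUrl: new URL(repoPath, pagesBase).href,
    };
  });
}
```

Calling buildPublishedPaths({ owner: "cioos-siooc", repo: "cioos-siooc-forms" }, "prod", "record-123") returns the XML entry with the example URL shown above, followed by the matching YAML entry.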
CKAN Catalogue Harvesting
Regional Association CKAN catalogues will be configured to harvest metadata from the GitHub Pages static website:
Harvesting Configuration:
- Harvest Source Type: WAF (Web Accessible Folder); CSW (Catalogue Service for the Web) would require a query endpoint, which a static GitHub Pages site does not provide
- Harvest URL: Points to the GitHub Pages static website
- Harvest Frequency: Configurable per CKAN instance (e.g., daily, weekly)
- Metadata Format: ISO19115-3 XML
Regional Catalogues:
- Pacific RA CKAN
- St. Lawrence RA CKAN
- Atlantic RA CKAN
- Amundsen RA CKAN
- CanWIN RA CKAN
Harvest Workflow:
- CKAN harvester periodically checks the GitHub Pages URLs for each environment
- Harvester detects new or updated XML files
- Harvester parses ISO19115-3 XML and extracts metadata fields
- CKAN dataset records are created/updated based on the harvested metadata
- Records become searchable in the regional CKAN catalogue interface
Benefits:
- Automated synchronization between metadata entry form and public catalogues
- Version control and audit trail via Git commits
- No direct database integration required between systems
- Standards-compliant metadata exchange using ISO19115-3
Future Enhancements (Out of Scope)
- Publish history tracking in Firebase
- Batch publishing (multiple records at once)
- Pull request creation instead of direct commit
- Webhook integration to notify CKAN harvesters of new content
- Preview of XML/YAML before publishing
- Rollback/unpublish functionality
- Different GitHub repos per environment
- Auto-publish on status change to "published"
- GitHub Pages index.html generation for browsing metadata files
- STAC (SpatioTemporal Asset Catalog) format output for harvesting
Security Considerations
- Token Storage: GitHub tokens should be stored securely, following existing patterns for DataCite credentials
- Access Control: Only reviewers and admins can publish; enforced both in UI and Cloud Function
- Firebase Rules: GitHub config readable by reviewers but only writable by admins
- Input Validation: Sanitize commit messages and filenames to prevent injection attacks
- Rate Limiting: Consider GitHub API rate limits (5000 requests/hour for authenticated requests)
- Audit Trail: Log all publishing actions for accountability
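The access control check inside the Cloud Function could be sketched as below. It assumes the permissions node holds comma-separated email strings under admins and reviewers: the admins key appears in the security rule in this issue, while the reviewers key and the comma-separated format are assumptions. The function name canPublish is hypothetical.

```javascript
// Hypothetical check: is this email listed as an admin or reviewer for the region?
// `permissions` is the value read from admin/{region}/permissions.
function canPublish(permissions, email) {
  if (!permissions || !email) return false;
  const wanted = email.trim().toLowerCase();
  return ["admins", "reviewers"].some((role) =>
    String(permissions[role] || "")
      .split(",")
      .map((entry) => entry.trim().toLowerCase())
      .includes(wanted)
  );
}
```

This compares whole email addresses, which is stricter than the substring test (contains) used in the database rule: with a substring test, a@x.ca would also match aa@x.ca.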
Additional Notes
- The feature leverages existing infrastructure (Firebase Functions, Python conversion, Octokit)
- Follows established patterns from issue.js for GitHub integration
- Maintains consistency with existing admin configuration patterns
- Provides flexibility through configurable environments and file naming
- Supports reviewer workflow without requiring direct GitHub access
- Integrates into the CIOOS metadata distribution pipeline:
  - Source: CIOOS Metadata Entry Form (this application)
  - Storage & Hosting: GitHub repository with GitHub Pages
  - Discovery: Regional CKAN catalogues harvest from GitHub Pages
  - End Users: Search and discover metadata through CKAN interfaces
Related Issues/PRs
- Related to existing GitHub issue creation feature in firebase-functions/functions/issue.js
- Builds on existing conversion functionality in firebase-functions/python-functions/main.py
- Complements existing publish/unpublish workflow in Reviewer page
Labels: enhancement, feature, reviewer, admin, github-integration, backend, frontend
Priority: Medium-High
Estimated Effort: 3-5 days
Assignee: TBD