Add SSSOM mappings for Dataset/DatasetCollection/File/FileCollection + renderer subtitle annotations #149
Merged
These four classes were added to the D4D schema after the original
semantic exchange layer was authored, leaving them without RO-Crate
mappings. This commit closes that gap.
Semantic SSSOM (src/data_sheets_schema/alignment/d4d_rocrate_sssom_mapping.tsv):
+12 rows (95 → 107)
- DatasetCollection → schema:Dataset (exactMatch, RO-Crate root)
- DatasetCollection → dcat:Catalog (closeMatch, semantic-catalog view)
- File → schema:MediaObject (exactMatch)
- File → schema:DigitalDocument (closeMatch)
- FileCollection → schema:Dataset (exactMatch, nested in hasPart)
- FileCollection → dcat:Distribution (closeMatch)
- 6 key-slot rows: DatasetCollection.resources/FileCollection.resources →
schema:hasPart, File.file_type → d4d:fileType, FileCollection.{collection_type,
file_count, total_bytes} → d4d:collectionType / d4d:fileCount / dcat:byteSize
Structural SSSOM (data/mappings/d4d_rocrate_structural_mapping.sssom.tsv):
+6 rows (149 → 155) — slot-level rows mirroring the semantic-file slots
SKOS alignment (src/data_sheets_schema/alignment/d4d_rocrate_skos_alignment.ttl):
- Added dcat: prefix declaration
- Added 6 class-level + 6 slot-level skos triples mirroring the SSSOM rows
Per the user's note that DatasetCollection may be the RO-Crate root
(@type=["Dataset", "https://w3id.org/EVI#ROCrate"], @id="./"),
DatasetCollection is given a dual mapping: exactMatch → schema:Dataset
(root semantics) and closeMatch → dcat:Catalog (semantic-catalog view).
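As an illustration only, the six class-level rows listed above can be written out as a minimal SSSOM-style TSV. This sketch shows four standard SSSOM columns rather than the 19-column layout of the canonical file; the `d4d:` subject CURIEs, the `semapv:ManualMappingCuration` justification, and the helper name `rows_to_tsv` are assumptions made for the example, not values copied from the repository.

```python
import csv
import io

# Class-level rows as described in the commit message above.
# (subject, predicate, object); subject CURIE form is an assumption.
CLASS_ROWS = [
    ("d4d:DatasetCollection", "skos:exactMatch", "schema:Dataset"),
    ("d4d:DatasetCollection", "skos:closeMatch", "dcat:Catalog"),
    ("d4d:File", "skos:exactMatch", "schema:MediaObject"),
    ("d4d:File", "skos:closeMatch", "schema:DigitalDocument"),
    ("d4d:FileCollection", "skos:exactMatch", "schema:Dataset"),
    ("d4d:FileCollection", "skos:closeMatch", "dcat:Distribution"),
]


def rows_to_tsv(rows, justification="semapv:ManualMappingCuration"):
    """Serialize (subject, predicate, object) triples as a small TSV string."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(
        ["subject_id", "predicate_id", "object_id", "mapping_justification"]
    )
    for subject, predicate, obj in rows:
        writer.writerow([subject, predicate, obj, justification])
    return buf.getvalue()


tsv_text = rows_to_tsv(CLASS_ROWS)
print(tsv_text)
```

Keeping the exactMatch and closeMatch rows as separate entries for the same subject is what expresses the dual mapping: a consumer that only wants the RO-Crate root semantics can filter on `skos:exactMatch`.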
Out of scope for this PR (existing TODOs remain):
- src/fairscape_integration/d4d_to_fairscape.py:292-295 — converter
code does not yet traverse FileCollection.resources to emit RO-Crate
File entities. The mapping layer is now ready; converter update is
a separate follow-up.
- The generated comprehensive/uri SSSOM variants weren't regenerated;
the canonical files (semantic + structural) are the source of truth.
Validation:
- SSSOMIntegration parses both files (semantic via custom reader,
structural via sssom-py per the existing column-naming setup)
- All 190 tests in tests/test_alignment + tests/test_fairscape_integration pass
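A quick sanity check on row counts can also be done without the project tooling. The sketch below is a standalone approximation, not the `SSSOMIntegration` reader: it assumes the TSV carries `#`-prefixed header comments, including a `# Total mappings: N` line, and the function name `count_sssom_rows` and the sample text are hypothetical.

```python
import csv
import io
import re


def count_sssom_rows(text):
    """Return (declared_total, actual_rows) for an SSSOM-style TSV string.

    Lines starting with '#' are treated as header comments; a
    '# Total mappings: N' comment, if present, gives the declared total.
    """
    declared = None
    data_lines = []
    for line in text.splitlines():
        if line.startswith("#"):
            match = re.match(r"#\s*Total mappings:\s*(\d+)", line)
            if match:
                declared = int(match.group(1))
            continue
        if line.strip():
            data_lines.append(line)
    reader = csv.DictReader(io.StringIO("\n".join(data_lines)), delimiter="\t")
    return declared, len(list(reader))


# Hypothetical two-row sample, not an excerpt of the real file.
SAMPLE = (
    "# Total mappings: 2\n"
    "subject_id\tpredicate_id\tobject_id\n"
    "d4d:File\tskos:exactMatch\tschema:MediaObject\n"
    "d4d:File\tskos:closeMatch\tschema:DigitalDocument\n"
)
declared, actual = count_sssom_rows(SAMPLE)
print(declared, actual)
```

A mismatch between the two numbers would indicate that the header comment was not refreshed after rows were added.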
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A reusable Claude Code slash command that captures the workflow used in this PR: adding D4D ↔ RO-Crate / FAIRSCAPE mappings for new schema classes.
The skill:
- Describes the 19-column semantic SSSOM and 17-column structural SSSOM layouts and points at the canonical files
- Provides a decision rubric for choosing primary/secondary RO-Crate targets based on class_uri / exact_mappings / tree_root annotations
- Includes row templates and a Python helper-script skeleton
- Documents standard RO-Crate target conventions (root Dataset, schema:MediaObject, dcat:Catalog, schema:hasPart, etc.)
- Specifies the mandatory validation step via SSSOMIntegration + pytest
- Codifies branch / commit / PR conventions
- Calls out known follow-ups to keep out of scope (converter TODOs, generator regen, schema YAML touch-ups)
Cross-references PR #147 as the canonical worked example.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Generated from the D4D ↔ RO-Crate semantic SSSOM by parsing rocrate_json_path patterns to extract entity types and their properties.
Shows:
- Dataset (root) with properties grouped by namespace (schema.org, DCAT, FAIRSCAPE EVI, Croissant RAI, D4D-specific)
- Sub-entities: MediaObject, Person, Organization, Grant, CreativeWork, DefinedTerm
- Reference edges (author/creator/contributor → Person, funder → Grant, publisher → Organization, citation → CreativeWork, about → DefinedTerm, hasPart → MediaObject)
- ROCrate as root marker connected via dashed @type edge
Generator: src/alignment/ (helper script captured in /tmp during this PR); rendered with graphviz dot -Gdpi=180.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per-class side-by-side comparison of slot counts in the d4d-core semantic exchange layer (left, orange) versus mapped/standard RO-Crate properties on the corresponding target type (right, green). Right-side counts combine SSSOM-discovered properties with the schema.org / RO-Crate 1.1 baseline for sub-entity types (Person, Organization, Grant, MediaObject, Distribution). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… site coverage
- src/data_sheets_schema/alignment/ → src/data_sheets_schema/semantic_exchange/ (canonical SKOS TTL + semantic SSSOM artifacts)
- data/mappings/ → data/semantic_exchange/ (sssom-py-compatible structural mapping + analysis docs)
- src/alignment/ → src/semantic_exchange/ (generator scripts)
- tests/test_alignment/ → tests/test_semantic_exchange/
Updated all path references in Makefile, generator scripts, schema YAMLs, fairscape_integration, notes, and tests. All 190 tests pass.
Visibility improvements:
- README.md: new "D4D-Core Schema" + "Semantic Exchange Layer" sections with per-artifact path tables
- docs/home.md: top-level pointers to D4D-Core and Semantic Exchange
- docs/d4d_core.md: new hand-curated landing page for the core schema (artifacts, build/validate targets, curated example datasheets, class crosswalk, rationale)
- docs/semantic_exchange.md: new hand-curated landing page for the exchange layer (canonical artifacts, generator scripts, validation, /d4d-add-mapping workflow, namespaces, coverage stats)
- mkdocs.yml: added "D4D-Core" and "Semantic Exchange" to nav
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously the chart only covered 8 hand-listed structural classes. Now it shows every d4d-core class, sorted by slot count, in a two-column layout with poster-friendly aspect (~1.84).
Right-side counts:
- Structural targets (Dataset/Distribution/Person/Org/Grant/etc.): full property surface (SSSOM-discovered + schema.org baseline)
- Property/wrapper classes: derived by looking up which slots have the class as range, then checking the SKOS TTL for mapped targets
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- SSSOM subject_id values for the 6 new key-slot rows now use the underscore form (d4d:Class_slot) to match the SKOS TTL subjects and what generate_sssom_mapping.py emits, so downstream lookups via SSSOMIntegration.get_mappings_by_subject() resolve correctly.
- SSSOM header refreshed: '# Total mappings: 107' (was 95) and '# Date: 2026-04-26'.
- SKOS TTL header bumped to Version 1.1 / Date 2026-04-26 and the alignment-statistics block updated to reflect the current 112 triples (69 exact / 25 close / 10 related / 7 narrow / 1 broad) and the per-namespace counts (schema.org 57, rai 29, d4d 10, evi 7, dcat 3, rdf 1).
Tests: 190 passed, 2 skipped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
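To make the subject_id convention concrete, here is a small sketch of why the exact string form matters for lookups. The helper `slot_subject_id` and the plain dict index are hypothetical stand-ins; they do not reproduce `SSSOMIntegration.get_mappings_by_subject()`, and the slot-to-target pairs are the six key-slot rows described in the first commit of this PR.

```python
def slot_subject_id(class_name: str, slot_name: str, prefix: str = "d4d") -> str:
    """Build a slot-level subject_id in the underscore form d4d:Class_slot."""
    return f"{prefix}:{class_name}_{slot_name}"


# Key-slot rows from this PR, keyed by underscore-form subject_id.
KEY_SLOT_TARGETS = {
    slot_subject_id("DatasetCollection", "resources"): "schema:hasPart",
    slot_subject_id("FileCollection", "resources"): "schema:hasPart",
    slot_subject_id("File", "file_type"): "d4d:fileType",
    slot_subject_id("FileCollection", "collection_type"): "d4d:collectionType",
    slot_subject_id("FileCollection", "file_count"): "d4d:fileCount",
    slot_subject_id("FileCollection", "total_bytes"): "dcat:byteSize",
}

# An exact-string lookup only succeeds when both sides use the same form.
print(KEY_SLOT_TARGETS.get("d4d:FileCollection_total_bytes"))  # dcat:byteSize
print(KEY_SLOT_TARGETS.get("d4d:FileCollection.total_bytes"))  # None
```

Any subject written in a different separator style simply misses an exact-match index, which is the failure mode the underscore normalization avoids.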
The skill doc still pointed contributors at the pre-rename paths (src/data_sheets_schema/alignment/, data/mappings/, src/alignment/, tests/test_alignment/) so its grep, git-add, and validation snippets no longer matched the canonical files. Repointed every reference to the renamed directories. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Coverage now: 73/76 d4d-core classes (96%) and 74/77 full schema classes (96%) mapped via class-level SKOS triples. Only the abstract base classes (DatasetProperty, Information, NamedThing) remain intentionally unmapped.
SKOS TTL changes (v1.2 → 189 triples, was 119):
- New class-level mappings for: DataSubset, CoreDataset, CoreDatasetCollection, CoreDistribution
- New sub-entity class mappings for: Person, Creator, Organization, Grantor, Grant, FundingMechanism, VariableMetadata, Software, Maintainer, DataCollector
- New DatasetProperty subclass mappings for ~50 wrapper classes (Instance, SamplingStrategy, LabelingStrategy, AnnotationAnalysis, HumanSubjectResearch, InformedConsent, Deidentification, RawDataSource, ImputationProtocol, AtRiskPopulations, …)
- Final gap-fill for consent workflow, ExportControlRegulatoryRestrictions, MissingInfo, ThirdPartySharing, FormatDialect
SSSOM TSV: +7 class-level rows (114 total). These are explicit Core* + DataSubset rows so d4d-core has its own class-level SSSOM coverage (d4d-core is the main visible / messaged data product).
Structural SSSOM: +4 class-level rows for the same.
Fig 5 (d4d-core) and Fig 6 (full) regenerated. Coloring scheme updated: ORANGE = in SSSOM exchange layer (mapped); BLUE = not yet mapped. Mapping detection credits a class if either it has a class-level SKOS triple OR a schema slot ranges to it with a slot-level SKOS triple.
Tests: 190 passed, 2 skipped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
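The mapping-detection rule described above can be sketched as a small predicate. This is an illustration under assumptions, not the figure-generation script: the function names, the `slot -> range class` dict shape, and the four-class toy data are all hypothetical.

```python
def is_mapped(cls, class_level, slot_ranges, slot_level):
    """A class counts as mapped if it has a class-level SKOS triple, OR
    some slot whose range is the class has a slot-level SKOS triple."""
    if cls in class_level:
        return True
    return any(
        rng == cls and slot in slot_level for slot, rng in slot_ranges.items()
    )


def coverage(classes, class_level, slot_ranges, slot_level):
    """Return (sorted mapped classes, rounded percentage mapped)."""
    mapped = sorted(
        c for c in classes if is_mapped(c, class_level, slot_ranges, slot_level)
    )
    pct = round(100 * len(mapped) / len(classes)) if classes else 0
    return mapped, pct


# Toy data for illustration only.
classes = ["File", "FormatDialect", "NamedThing", "Person"]
class_level = {"File", "Person"}               # classes with class-level triples
slot_ranges = {"dialect": "FormatDialect", "name": "NamedThing"}
slot_level = {"dialect"}                       # slots with slot-level triples

mapped, pct = coverage(classes, class_level, slot_ranges, slot_level)
print(mapped, pct)
```

In the toy data, `FormatDialect` is credited through the `dialect` slot while `NamedThing` stays unmapped because its only incoming slot has no slot-level triple. The same rounding applied to the reported counts gives 96% for both 73/76 and 74/77.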
Horizontal bar chart of every external vocabulary referenced by the d4d-core schema (class_uri / slot_uri / *_mappings) plus SKOS-target counts from the alignment TTL.
Bars colored by category:
- FAIR-core (schema.org, DCTerms, DCAT)
- RAI / Ethics (Croissant RAI, DUO)
- Provenance / Quality (FAIRSCAPE EVI, PROV-O, AIO, QUDT, SKOS)
- Domain / Internal (d4d:)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Top-cropped 1100px slice from each project's curated
*_human_readable.html so the poster panel shows the project name
("AI READI Dataset Documentation", etc.) plus the first section
header rather than mid-document content.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds src/html/output/forced-dark.css — a dark theme override that extends datasheet-common.css with table styling for dark mode (the existing @media (prefers-color-scheme: dark) block didn't cover .data-table). Used via weasyprint's -s flag to render the four project HTML thumbnails for the poster. The dark theme matches the requested design: navy/charcoal background, purple gradient header, dark cards with rounded corners, blue-accent left border on table cells. Margins auto-trimmed in the render script so the thumbnails are tight to content. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reduce thumb crop from 2400px back to 1750px (between original 1200 and doubled 2400) to make room for a d4d-core LinkML schema snippet PNG below each project thumbnail in the poster's records panel. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add `annotations.d4d:section_question` to each of the 8 D4D module YAMLs holding the canonical Datasheets-for-Datasets paper question (e.g. Motivation: "For what purpose was the dataset created?"). Update the human-readable HTML renderer to read these annotations via yaml.safe_load and use them as the per-section subtitle, with the previous hardcoded strings retained only as fallbacks. This makes the schema the single source of truth for section subtitles, so the HTML and YAML can no longer drift (e.g. "Why was this dataset created?" vs "For what purpose was the dataset created?"). Re-rendered the 4 curated GC datasheets and the dark-theme poster thumbnails. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
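The annotation-first lookup with a hardcoded fallback can be sketched as follows. This is not the code in `human_readable_renderer.py`: it assumes the module YAML has already been parsed into a dict (for example by `yaml.safe_load`), and the names `section_subtitle` and `FALLBACK_SUBTITLES` are hypothetical. It accepts both a plain string annotation and an expanded `{value: ...}` form.

```python
# Hypothetical fallback table; only used when the schema has no annotation.
FALLBACK_SUBTITLES = {
    "Motivation": "Why was this dataset created?",
}


def section_subtitle(section: str, module: dict) -> str:
    """Prefer annotations.d4d:section_question from the parsed module YAML,
    falling back to the hardcoded string for that section."""
    annotations = module.get("annotations") or {}
    question = annotations.get("d4d:section_question")
    if isinstance(question, dict):          # expanded annotation form
        question = question.get("value")
    if isinstance(question, str) and question.strip():
        return question.strip()
    return FALLBACK_SUBTITLES.get(section, "")


annotated = {
    "annotations": {
        "d4d:section_question": "For what purpose was the dataset created?"
    }
}
print(section_subtitle("Motivation", annotated))  # schema annotation wins
print(section_subtitle("Motivation", {}))         # falls back to hardcoded text
```

Because the schema value takes precedence whenever it is present, the fallback text can only surface for a module that lacks the annotation.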
Split 76 d4d-core classes into 4 quartiles (19 each) arranged 2x2 so the row labels can grow from 8pt to 18pt while keeping all classes visible. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Split 76 d4d-core classes into 6 groups (~13 each) arranged 2x3 to bring row labels up to 20pt (from 18pt). Drop the "→ target" suffix to fit narrower per-panel widths; the target type is captured by the D4D-core / RO-Crate column headers above each panel. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Robot enlarged (use width 320 → 440) with viewBox extended to fit three angled arms per side (6 arms total).
- "Working indicator" gears under the robot recolored from orange to black per poster review feedback.
- Bumped chest "agent" text and "D4D-Core record" header from 16pt to 20pt so all text in the figure is at least 20pt.
- Tagline updated: "Beautiful, computable D4D records" → "Standardized, computable D4D records".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extended the robot head height 120 → 140 px and shifted it up by 15 px. Added a red curved headband across the top of the head with bold white "D4D" lettering. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each Heterogeneous-sources row now has 4 stick figures (a foreground "main" person plus 3 smaller / faded background figures) so the figure reads as multiple project members contributing each kind of source document. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaced the chunky 24px arms with thin 12px arms (left/right side, 3 each) explicitly pointing outward at ±45° and 0°. Recolored the chest "D4D agent" panel to match the red headband (white text on red). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Insert a LinkML-schema-validation block between the agent and the output
records. A red dashed feedback arrow ("errors → corrections") loops from
the validator back to the agent, showing iterative correction. Source-
to-robot input arrows trimmed by ~70 px so they don't intrude into the
robot's outward-pointing arms.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Re-routed the validation feedback path from the bottom of the agent to the headlight at the top of the antenna, and fixed the red arrow marker to userSpaceOnUse so the arrowhead is correctly sized (18x18 px) relative to the headlight instead of being scaled up 5x by stroke width. Also recolored the headlight from orange to clear red with a subtle dark-red outline to match the "red light" framing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ler boxes
- Scale all fonts up so every label is at least 22 pt (sources/records headers 36, agent label 34, headband 26, project labels 24, etc.).
- Robot arms restructured: LEFT side gets three arms (top points UP, middle points LEFT, bottom points DOWN); RIGHT side has a single middle arm pointing into the validator. Arms are thin (14 px).
- Source rows: people enlarged (main 80→115, crowd 65→95 / 55→80) and cardboard boxes shrunk (220→170 wide, 160→120 tall) so contributors read as people, not packages.
- Feedback path tightened to land squarely on the antenna's red headlight at (1075,215); removed the redundant "validated" label next to the validator (records already show green checkmarks).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Strip data/poster_assets/ — figures, thumbnails, QR codes, robot SVG — that were committed in support of the Bridge2AI April 2026 F2F poster. The Google Slides deck remains the canonical artefact for poster work; none of these PNG/SVG/dot files are referenced by the schema or publication site. Schema and renderer changes that were motivated by poster work but are useful in their own right are preserved on this branch (module-level section_question annotations, the renderer reading them, the SSSOM exchange-layer expansions, and the alignment → semantic_exchange rename). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Restore the D4D Assistant pipeline figure (was removed in previous clean-up) and tweak two details:
- Left arms now point diagonally away from the body: top arm UP-LEFT (rotate +45 around the upper-left shoulder), bottom arm DOWN-LEFT (rotate -45 around the lower-left shoulder); middle arm horizontal LEFT.
- "errors → corrections" label moved from y=55 down to y=145, hugging the apex of the red dashed feedback arc instead of floating high above it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch from 6 small panels (2x3, 13 classes each) to 3 long panels (1x3, ~25 classes each) so the row labels can grow to 22pt while still showing all 76 d4d-core classes. Figure aspect now ~1.25 (taller) to fill the white space available below in the exchange-layer panel. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the orange-vs-blue (mapped vs unmapped) coloring with a per-module color scale on both fig5 (d4d-core) and fig6 (full schema): Motivation/Composition/Collection/Preprocessing/Uses/Distribution/ Maintenance/Human/Ethics/Data_Governance get distinct hues; Variables + Base_import + structural top-level classes use neutral grays. Composition arrows are colored by the source class's module so each hub fans out as a module-colored star. All edge weights bumped +1 pt (2.5 → 3.5 for hub borders, 1.5 → 2.5 for normal borders, 1 → 2 for composition edges) so the figures read clearly at poster scale. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch sfdp from splines=spline (which fell back to straight lines when nodes touched) to splines=curved (Catmull-Rom, more permissive). All edges get a B3 alpha suffix (70% opacity) so when an arrow crosses a node or another arrow it reads through instead of obscuring. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop the "Bridge2AI Semantic Exchange Layer — slot density across all 76 d4d-core classes ..." suptitle from the figure itself; the panel title above the figure on the poster already carries that framing. The freed vertical space lets the bars take up more of the canvas.
…in hub
The dialect → FormatDialect edge was getting visually lost among the ~50 other edges out of CoreDataset. Render it solid (no alpha), thicker (4.0pt), and with a bold, larger label.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous "top 26 / next 25 / last 25 of 76" wording was confusing — "top 26 + next 25" reads as 51, not 51 of a larger set. Use explicit "ranks 1–26 of 76 (most slots)" / "ranks 27–51 of 76" / "ranks 52–76 of 76 (fewest slots)" so the totals are unambiguous. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
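The explicit rank wording can be generated rather than typed by hand. The sketch below is an assumption about how such labels could be produced, not the plotting script itself; `rank_panels` and `panel_labels` are hypothetical names, and any remainder is given to the earliest panels so that 76 classes over 3 panels split as 26 / 25 / 25.

```python
def rank_panels(total: int, panels: int):
    """Split ranks 1..total into contiguous (first, last) spans, giving
    any remainder to the earliest panels."""
    base, extra = divmod(total, panels)
    spans, start = [], 1
    for i in range(panels):
        size = base + (1 if i < extra else 0)
        spans.append((start, start + size - 1))
        start += size
    return spans


def panel_labels(total: int, panels: int):
    """Build unambiguous 'ranks a–b of N' labels for each panel."""
    spans = rank_panels(total, panels)
    labels = []
    for i, (first, last) in enumerate(spans):
        label = f"ranks {first}–{last} of {total}"
        if i == 0:
            label += " (most slots)"
        elif i == len(spans) - 1:
            label += " (fewest slots)"
        labels.append(label)
    return labels


for label in panel_labels(76, 3):
    print(label)
```

Stating both the span and the total in every label means each panel title can be read on its own without adding up the others.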
Poster artifacts (PNG/SVG figures) belong outside this repo. Keeping them on the working tree only — not committed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Re-rendered AI_READI/CM4AI/VOICE curated HTMLs with current renderer (adds blue-bar styling on description fields, schema-driven section subtitles). Add the renderer's stylesheet to the nested docs path so GitHub Pages serves it co-located with the HTML files. Without this, the relative <link href="datasheet-common.css"> 404s on the deployed site and the page renders unstyled. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
Adds missing Semantic Exchange Layer coverage for newly introduced D4D classes and aligns the HTML renderer/documentation with schema-driven section subtitles, while relocating mapping artifacts to the semantic_exchange/ directory structure used by the generators and tests.
Changes:
- Extend SKOS/SSSOM alignment to cover `DatasetCollection`, `FileCollection`, `File`, plus additional class/slot mappings, and regenerate semantic-exchange TSVs under `data/semantic_exchange/`.
- Add `d4d:section_question` annotations to core D4D module YAMLs and update the human-readable renderer to source section subtitles from schema annotations.
- Update docs, Make targets, and tests to use `data/semantic_exchange/` and add new MkDocs pages for D4D-Core + Semantic Exchange.
Reviewed changes
Copilot reviewed 49 out of 81 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/test_semantic_exchange/test_sssom_validation.py | Update tests to read SSSOM artifacts from data/semantic_exchange/. |
| tests/test_semantic_exchange/__init__.py | Add package init/docstring for semantic-exchange tests. |
| tests/test_fairscape_integration/test_sssom_reader.py | Point real-file structural mapping load test at data/semantic_exchange/. |
| tests/test_fairscape_integration/test_sssom_integration.py | Point real-file structural mapping load test at data/semantic_exchange/. |
| src/semantic_exchange/implement_uri_mappings.py | Fix docstring paths/usage to reflect semantic-exchange directory layout. |
| src/semantic_exchange/generate_structural_mapping.py | Generate structural mapping output into data/semantic_exchange/. |
| src/semantic_exchange/generate_sssom_uri_mapping.py | Update default SKOS/SSSOM output paths to src/data_sheets_schema/semantic_exchange/. |
| src/semantic_exchange/generate_sssom_mapping.py | Update default SKOS/SSSOM output paths to src/data_sheets_schema/semantic_exchange/. |
| src/semantic_exchange/generate_comprehensive_sssom_uri.py | Update default SKOS/SSSOM output paths to src/data_sheets_schema/semantic_exchange/. |
| src/semantic_exchange/generate_comprehensive_sssom.py | Update default SKOS/SSSOM output paths to src/data_sheets_schema/semantic_exchange/. |
| src/semantic_exchange/add_slot_uris.py | Add one-shot helper to apply slot_uri recommendations into schema YAMLs. |
| src/semantic_exchange/add_module_column.py | Write SSSOM module-annotation outputs into data/semantic_exchange/. |
| src/html/output/forced-dark.css | Add a forced dark theme stylesheet intended for screenshot renders. |
| src/html/human_readable_renderer.py | Pull section subtitles from module annotations.d4d:section_question with fallbacks. |
| src/fairscape_integration/fairscape_to_d4d.py | Update default semantic SSSOM path to src/data_sheets_schema/semantic_exchange/. |
| src/fairscape_integration/README_STANDARD_TOOLING.md | Update example paths from data/mappings/ to data/semantic_exchange/. |
| src/data_sheets_schema/semantic_exchange/d4d_rocrate_sssom_uri_mapping.tsv | Add/refresh generated URI-level SSSOM mapping TSV in canonical location. |
| src/data_sheets_schema/semantic_exchange/d4d_rocrate_sssom_mapping_subset.tsv | Add/refresh subset semantic SSSOM TSV in canonical location. |
| src/data_sheets_schema/semantic_exchange/d4d_rocrate_skos_alignment.ttl | Expand SKOS alignment triples (incl. DCAT prefix and new class/slot mappings). |
| src/data_sheets_schema/schema/data_sheets_schema_core_all.yaml | Update see_also SKOS alignment path to new semantic-exchange location. |
| src/data_sheets_schema/schema/data_sheets_schema_core.yaml | Update see_also SKOS alignment path to new semantic-exchange location. |
| src/data_sheets_schema/schema/D4D_Uses.yaml | Add d4d:section_question annotation for renderer subtitle. |
| src/data_sheets_schema/schema/D4D_Preprocessing.yaml | Add d4d:section_question annotation for renderer subtitle. |
| src/data_sheets_schema/schema/D4D_Motivation.yaml | Add d4d:section_question annotation for renderer subtitle. |
| src/data_sheets_schema/schema/D4D_Maintenance.yaml | Add d4d:section_question annotation for renderer subtitle. |
| src/data_sheets_schema/schema/D4D_Human.yaml | Add d4d:section_question annotation for renderer subtitle. |
| src/data_sheets_schema/schema/D4D_Distribution.yaml | Add d4d:section_question annotation for renderer subtitle. |
| src/data_sheets_schema/schema/D4D_Core.yaml | Update see_also SKOS alignment path to new semantic-exchange location. |
| src/data_sheets_schema/schema/D4D_Composition.yaml | Add d4d:section_question annotation for renderer subtitle. |
| src/data_sheets_schema/schema/D4D_Collection.yaml | Add d4d:section_question annotation for renderer subtitle. |
| notes/SEMANTIC_EXCHANGE_IMPLEMENTATION.md | Update referenced semantic-exchange artifact paths. |
| mkdocs.yml | Add nav entries for new d4d_core.md and semantic_exchange.md pages. |
| docs/semantic_exchange.md | New docs page describing the Semantic Exchange Layer artifacts and workflows. |
| docs/html_output/concatenated/curated/datasheet-common.css | Add co-located common stylesheet for curated HTML outputs. |
| docs/home.md | Add D4D-Core + Semantic Exchange sections and update repo structure links. |
| docs/d4d_core.md | New docs page introducing the D4D-Core schema and curated examples. |
| data/semantic_exchange/uri_mapping_recommendations.md | Add URI mapping recommendation write-up under semantic-exchange data dir. |
| data/semantic_exchange/d4d_rocrate_structural_mapping_summary.md | Add generated structural mapping summary markdown. |
| data/semantic_exchange/d4d_rocrate_structural_mapping.sssom.tsv | Add new mapping rows (incl. DatasetCollection/FileCollection/File and class-level rows). |
| data/semantic_exchange/d4d_rocrate_sssom_uri_mapping.tsv | Add/refresh generated URI-level mapping TSV under data dir. |
| data/semantic_exchange/d4d_rocrate_sssom_mapping_subset.tsv | Add/refresh subset semantic SSSOM TSV under data dir. |
| data/semantic_exchange/STRUCTURAL_MAPPING_ANALYSIS.md | Update script/path references to semantic-exchange generator + output dirs. |
| data/semantic_exchange/README.md | New README describing semantic-exchange data directory contents. |
| data/poster_assets/figures/fig6_d4d_full.dot | Remove poster-asset source artifact (DOT). |
| data/poster_assets/figures/fig5_d4d_core_schema.mmd | Remove poster-asset source artifact (Mermaid). |
| data/poster_assets/figures/fig5_d4d_core_full.mmd | Remove poster-asset source artifact (Mermaid). |
| data/poster_assets/figures/fig5_d4d_core_full.dot | Remove poster-asset source artifact (DOT). |
| data/poster_assets/figures/fig3_pipeline_assistant.mmd | Remove poster-asset source artifact (Mermaid). |
| data/poster_assets/figures/fig3_humans_robot.svg | Remove poster-asset source artifact (SVG). |
| data/poster_assets/figures/fig2_rocrate_bridge.mmd | Remove poster-asset source artifact (Mermaid). |
| data/poster_assets/figures/fig1_semantic_exchange.mmd | Remove poster-asset source artifact (Mermaid). |
| data/poster_assets/figures/fig6_d4d_full.png | Add/update poster figure raster asset. |
| data/poster_assets/figures/fig5_d4d_core_schema_v3.png | Add/update poster figure raster asset. |
| data/poster_assets/figures/fig5_d4d_core_schema.png | Add/update poster figure raster asset. |
| data/poster_assets/figures/fig5_d4d_core_full.png | Add/update poster figure raster asset. |
| data/poster_assets/figures/fig3_pipeline_assistant.png | Add/update poster figure raster asset. |
| data/poster_assets/figures/fig3_humans_robot.png | Add/update poster figure raster asset. |
| data/poster_assets/figures/fig2_rocrate_bridge.png | Add/update poster figure raster asset. |
| data/poster_assets/figures/fig1_semantic_exchange.png | Add/update poster figure raster asset. |
| data/poster_assets/d4d_core_thumbnails/qr_repo.png | Add/update thumbnail/QR asset. |
| data/poster_assets/d4d_core_thumbnails/qr_docs.png | Add/update thumbnail/QR asset. |
| data/poster_assets/d4d_core_thumbnails/VOICE_thumb.png | Add/update thumbnail asset. |
| data/poster_assets/d4d_core_thumbnails/CM4AI_thumb.png | Add/update thumbnail asset. |
| data/poster_assets/d4d_core_thumbnails/CHORUS_thumb.png | Add/update thumbnail asset. |
| data/poster_assets/d4d_core_thumbnails/AI_READI_thumb.png | Add/update thumbnail asset. |
| README.md | Document D4D-Core entry point + semantic exchange artifacts and test/build commands. |
| Makefile | Repoint SSSOM generator targets and artifact paths to semantic-exchange directories. |
| .claude/commands/d4d-add-mapping.md | Add a Claude Code skill document for adding new exchange-layer mappings. |
| .claude/commands/README.md | List new /d4d-add-mapping command in Claude commands README. |
Comments suppressed due to low confidence (1)
data/semantic_exchange/d4d_rocrate_structural_mapping.sssom.tsv:160
- These new rows are class-level mappings (subject_id lacks the usual `d4d:Class/slot` shape used throughout this structural mapping file). If downstream tooling assumes structural mappings are slot-level, these entries can create ambiguity. Consider moving class-level alignments to the semantic SSSOM/SKOS TTL (and keeping this file slot-level), or adopt a consistent convention (e.g., a reserved pseudo-slot) and update any readers/docs accordingly.
1. src/html/human_readable_renderer.py: open the module YAML with explicit encoding='utf-8' so non-ASCII section_question text decodes consistently across platforms (matches the convention used in render_yaml_file).
2. src/html/output/forced-dark.css: remove. Was added for one-off poster screenshots and not referenced anywhere; the renderer/CLI doesn't accept a CSS-selection flag, so the file was dead weight.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Conflict resolution:
- Canonical SSSOM/SKOS files at src/data_sheets_schema/semantic_exchange/: kept ours (114-row mapping, plus 7 d4d-core class additions on top of the 107-row baseline that PR #147 already shipped, plus expanded SKOS TTL).
- Mapping TSVs duplicated under data/semantic_exchange/: deleted. PR #148 (Name cleanup) already moved them to the canonical src/data_sheets_schema/semantic_exchange/ location.
- Poster figures added by main (fig7_rocrate_profile.{dot,png}, fig8_exchange_butterfly.png): removed per project rule that poster artifacts don't get committed here.
- README + test_sssom_validation.py: took main's version (correctly reflects the post-#148 structural/canonical split).
- docs/html_output/concatenated/curated/*.html re-rendered from current renderer + curated YAMLs (generated, not hand-merged).
- data/semantic_exchange/d4d_rocrate_structural_mapping.sssom.tsv: kept ours (superset of main).
Tests: tests.test_semantic_exchange.test_sssom_validation passes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- `Dataset`, `DatasetCollection`, `File`, `FileCollection`: `DatasetCollection` gets a dual mapping (RO-Crate root "./" → `schema:Dataset`, plus `dcat:Catalog` close match). `FileCollection` maps to nested RO-Crate `Dataset` / `dcat:Distribution`. `File` maps to `schema:MediaObject` / `schema:DigitalDocument`. Adds class-level + key-slot rows in both the SKOS TTL (`d4d_rocrate_skos_alignment.ttl`) and the SSSOM TSV (`d4d_rocrate_sssom_mapping.tsv`); regenerated derivative TSVs in `data/semantic_exchange/`.
- Adds `d4d:section_question` annotations to 8 D4D modules (`Motivation`, `Composition`, `Collection`, `Preprocessing`, `Uses`, `Distribution`, `Maintenance`, `Human`) carrying the canonical Gebru-style section questions.
- `src/html/human_readable_renderer.py` now sources section subtitles from those `d4d:section_question` annotations rather than a hardcoded dict, so subtitle drift between schema and HTML is eliminated. Adds a `forced-dark.css` for dark-mode renders.
- `tests/test_semantic_exchange/test_sssom_validation.py` updated for the new mapping rows.
- Re-rendered `docs/html_output/concatenated/curated/` with the new renderer (now showing schema-annotation subtitles, blue-bar long-description styling, and shipping `datasheet-common.css` co-located with the HTML so GitHub Pages serves it correctly).
Test plan
- `poetry run pytest tests/test_semantic_exchange/test_sssom_validation.py -v` passes
- `poetry run python -c "from data_sheets_schema.utils.sssom_integration import SSSOMIntegration; s=SSSOMIntegration('src/data_sheets_schema/semantic_exchange/d4d_rocrate_sssom_mapping.tsv'); print(len(s.msdf.df))"` returns the expected mapping count
- `Dataset`, `DatasetCollection`, `File`, `FileCollection` all appear in the subject_id list of the regenerated SSSOM
- `make gen-project && make test-modules` validates cleanly
🤖 Generated with Claude Code