Add SSSOM mappings for Dataset/DatasetCollection/File/FileCollection + renderer subtitle annotations #149

Merged
realmarcin merged 35 commits into main from update_exchange2
Apr 29, 2026

@realmarcin
Collaborator

Summary

  • Semantic exchange layer — add SSSOM/SKOS mappings for the four classes that were added to D4D after the original exchange layer was authored: Dataset, DatasetCollection, File, FileCollection. DatasetCollection gets a dual mapping (RO-Crate root "./" → schema:Dataset, plus dcat:Catalog close match). FileCollection maps to nested RO-Crate Dataset / dcat:Distribution. File maps to schema:MediaObject / schema:DigitalDocument. Adds class-level + key-slot rows in both the SKOS TTL (d4d_rocrate_skos_alignment.ttl) and the SSSOM TSV (d4d_rocrate_sssom_mapping.tsv); regenerated derivative TSVs in data/semantic_exchange/.
  • Schema annotations — add d4d:section_question annotations to 8 D4D modules (Motivation, Composition, Collection, Preprocessing, Uses, Distribution, Maintenance, Human) carrying the canonical Gebru-style section questions.
  • Renderer: src/html/human_readable_renderer.py now sources section subtitles from those d4d:section_question annotations rather than a hardcoded dict, so subtitle drift between schema and HTML is eliminated. Adds a forced-dark.css for dark-mode renders.
  • Tests: tests/test_semantic_exchange/test_sssom_validation.py updated for the new mapping rows.
  • Curated HTMLs — re-rendered the 4 curated D4D datasheets in docs/html_output/concatenated/curated/ with the new renderer (now showing schema-annotation subtitles, blue-bar long-description styling, and shipping datasheet-common.css co-located with the HTML so GitHub Pages serves it correctly).

Test plan

  • poetry run pytest tests/test_semantic_exchange/test_sssom_validation.py -v passes
  • poetry run python -c "from data_sheets_schema.utils.sssom_integration import SSSOMIntegration; s=SSSOMIntegration('src/data_sheets_schema/semantic_exchange/d4d_rocrate_sssom_mapping.tsv'); print(len(s.msdf.df))" prints the expected mapping count
  • Dataset, DatasetCollection, File, FileCollection all appear in the subject_id list of the regenerated SSSOM
  • Live deploy at https://bridge2ai.github.io/data-sheets-schema/html_output/concatenated/curated/AI_READI_human_readable.html renders with the gradient header and blue-bar descriptions
  • make gen-project && make test-modules validates cleanly
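
The subject_id check in the test plan can be approximated without sssom-py. This is a minimal sketch, assuming the TSV carries `#`-prefixed metadata lines ahead of a header row with a `subject_id` column, and that class-level subjects are written as `d4d:<Class>`. The helper name `missing_subjects` and the sample rows are illustrative and not taken from the repository.

```python
import csv
import io

# Assumed CURIE form for the four classes named in the test plan.
REQUIRED = {"d4d:Dataset", "d4d:DatasetCollection", "d4d:File", "d4d:FileCollection"}


def missing_subjects(tsv_text, required=REQUIRED):
    """Return the required subject_ids that never appear in the TSV."""
    # Skip '#'-prefixed metadata lines (assumed SSSOM-style header block).
    body = [ln for ln in tsv_text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(body)), delimiter="\t")
    seen = {row["subject_id"] for row in reader}
    return sorted(set(required) - seen)


# Illustrative sample, deliberately missing DatasetCollection.
sample = (
    "# Total mappings: 3\n"
    "subject_id\tpredicate_id\tobject_id\n"
    "d4d:Dataset\tskos:exactMatch\tschema:Dataset\n"
    "d4d:File\tskos:exactMatch\tschema:MediaObject\n"
    "d4d:FileCollection\tskos:closeMatch\tdcat:Distribution\n"
)
print(missing_subjects(sample))
```

An empty list from `missing_subjects` would correspond to the third test-plan item passing.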

🤖 Generated with Claude Code

realmarcin and others added 30 commits April 25, 2026 18:23
These four classes were added to the D4D schema after the original
semantic exchange layer was authored, leaving them without RO-Crate
mappings. This commit closes that gap.

Semantic SSSOM (src/data_sheets_schema/alignment/d4d_rocrate_sssom_mapping.tsv):
  +12 rows (95 → 107)
  - DatasetCollection → schema:Dataset (exactMatch, RO-Crate root)
  - DatasetCollection → dcat:Catalog (closeMatch, semantic-catalog view)
  - File → schema:MediaObject (exactMatch)
  - File → schema:DigitalDocument (closeMatch)
  - FileCollection → schema:Dataset (exactMatch, nested in hasPart)
  - FileCollection → dcat:Distribution (closeMatch)
  - 6 key-slot rows: DatasetCollection.resources/FileCollection.resources →
    schema:hasPart, File.file_type → d4d:fileType, FileCollection.{collection_type,
    file_count, total_bytes} → d4d:collectionType / d4d:fileCount / dcat:byteSize

Structural SSSOM (data/mappings/d4d_rocrate_structural_mapping.sssom.tsv):
  +6 rows (149 → 155) — slot-level rows mirroring the semantic-file slots

SKOS alignment (src/data_sheets_schema/alignment/d4d_rocrate_skos_alignment.ttl):
  - Added dcat: prefix declaration
  - Added 6 class-level + 6 slot-level skos triples mirroring the SSSOM rows

Per the user's note that DatasetCollection may be the RO-Crate root
(@type=["Dataset", "https://w3id.org/EVI#ROCrate"], @id="./"),
DatasetCollection is given a dual mapping: exactMatch → schema:Dataset
(root semantics) and closeMatch → dcat:Catalog (semantic-catalog view).
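
The primary/secondary split described above can be pictured as plain data. This is a sketch only: the six class-level rows are copied from this commit message, while the `skos:` prefix on the predicates, the tuple layout, and the helper `targets_by_class` are assumptions made for illustration.

```python
from collections import defaultdict

# Class-level rows as listed in the commit message: (subject, predicate, object).
ROWS = [
    ("DatasetCollection", "skos:exactMatch", "schema:Dataset"),
    ("DatasetCollection", "skos:closeMatch", "dcat:Catalog"),
    ("File", "skos:exactMatch", "schema:MediaObject"),
    ("File", "skos:closeMatch", "schema:DigitalDocument"),
    ("FileCollection", "skos:exactMatch", "schema:Dataset"),
    ("FileCollection", "skos:closeMatch", "dcat:Distribution"),
]


def targets_by_class(rows):
    """Group targets per class, treating the exactMatch as the primary target."""
    grouped = defaultdict(lambda: {"primary": None, "secondary": []})
    for subject, predicate, obj in rows:
        if predicate == "skos:exactMatch":
            grouped[subject]["primary"] = obj
        else:
            grouped[subject]["secondary"].append(obj)
    return dict(grouped)


for cls, targets in targets_by_class(ROWS).items():
    print(cls, targets["primary"], targets["secondary"])
```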

Out of scope for this PR (existing TODOs remain):
  - src/fairscape_integration/d4d_to_fairscape.py:292-295 — converter
    code does not yet traverse FileCollection.resources to emit RO-Crate
    File entities. The mapping layer is now ready; converter update is
    a separate follow-up.
  - The generated comprehensive/uri SSSOM variants weren't regenerated;
    the canonical files (semantic + structural) are the source of truth.

Validation:
  - SSSOMIntegration parses both files (semantic via custom reader,
    structural via sssom-py per the existing column-naming setup)
  - All 190 tests in tests/test_alignment + tests/test_fairscape_integration pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A reusable Claude Code slash command that captures the workflow used in
this PR — adding D4D ↔ RO-Crate / FAIRSCAPE mappings for new schema
classes. The skill:

- Describes the 19-column semantic SSSOM and 17-column structural SSSOM
  layouts and points at the canonical files
- Provides a decision rubric for choosing primary/secondary RO-Crate
  targets based on class_uri / exact_mappings / tree_root annotations
- Includes row templates and a Python helper-script skeleton
- Documents standard RO-Crate target conventions (root Dataset,
  schema:MediaObject, dcat:Catalog, schema:hasPart, etc.)
- Specifies the mandatory validation step via SSSOMIntegration + pytest
- Codifies branch / commit / PR conventions
- Calls out known follow-ups to keep out of scope (converter TODOs,
  generator regen, schema YAML touch-ups)

Cross-references PR #147 as the canonical worked example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Generated from the D4D ↔ RO-Crate semantic SSSOM by parsing rocrate_json_path
patterns to extract entity types and their properties. Shows:
- Dataset (root) with properties grouped by namespace (schema.org, DCAT,
  FAIRSCAPE EVI, Croissant RAI, D4D-specific)
- Sub-entities: MediaObject, Person, Organization, Grant, CreativeWork,
  DefinedTerm
- Reference edges (author/creator/contributor → Person, funder → Grant,
  publisher → Organization, citation → CreativeWork, about → DefinedTerm,
  hasPart → MediaObject)
- ROCrate as root marker connected via dashed @type edge

Generator: src/alignment/ (helper script captured in /tmp during this PR);
rendered with graphviz dot -Gdpi=180.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per-class side-by-side comparison of slot counts in the d4d-core
semantic exchange layer (left, orange) versus mapped/standard
RO-Crate properties on the corresponding target type (right, green).

Right-side counts combine SSSOM-discovered properties with the
schema.org / RO-Crate 1.1 baseline for sub-entity types
(Person, Organization, Grant, MediaObject, Distribution).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… site coverage

- src/data_sheets_schema/alignment/ → src/data_sheets_schema/semantic_exchange/
  (canonical SKOS TTL + semantic SSSOM artifacts)
- data/mappings/ → data/semantic_exchange/
  (sssom-py-compatible structural mapping + analysis docs)
- src/alignment/ → src/semantic_exchange/  (generator scripts)
- tests/test_alignment/ → tests/test_semantic_exchange/

Updated all path references in Makefile, generator scripts, schema YAMLs,
fairscape_integration, notes, and tests. All 190 tests pass.

Visibility improvements:
- README.md: new "D4D-Core Schema" + "Semantic Exchange Layer" sections
  with per-artifact path tables
- docs/home.md: top-level pointers to D4D-Core and Semantic Exchange
- docs/d4d_core.md: new hand-curated landing page for the core schema
  (artifacts, build/validate targets, curated example datasheets, class
  crosswalk, rationale)
- docs/semantic_exchange.md: new hand-curated landing page for the
  exchange layer (canonical artifacts, generator scripts, validation,
  /d4d-add-mapping workflow, namespaces, coverage stats)
- mkdocs.yml: added "D4D-Core" and "Semantic Exchange" to nav

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously the chart only covered 8 hand-listed structural classes.
Now it shows every d4d-core class, sorted by slot count, in a
two-column layout with poster-friendly aspect (~1.84).

Right-side counts:
- Structural targets (Dataset/Distribution/Person/Org/Grant/etc.):
  full property surface (SSSOM-discovered + schema.org baseline)
- Property/wrapper classes: derived by looking up which slots have
  the class as range, then checking the SKOS TTL for mapped targets

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- SSSOM subject_id values for the 6 new key-slot rows now use the
  underscore form (d4d:Class_slot) to match the SKOS TTL subjects
  and what generate_sssom_mapping.py emits, so downstream lookups
  via SSSOMIntegration.get_mappings_by_subject() resolve correctly.
- SSSOM header refreshed: '# Total mappings: 107' (was 95) and
  '# Date: 2026-04-26'.
- SKOS TTL header bumped to Version 1.1 / Date 2026-04-26 and the
  alignment-statistics block updated to reflect the current 112
  triples (69 exact / 25 close / 10 related / 7 narrow / 1 broad)
  and the per-namespace counts (schema.org 57, rai 29, d4d 10,
  evi 7, dcat 3, rdf 1).
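
Why the subject_id form matters for lookups can be shown with a small stand-in. The sketch below is not SSSOMIntegration.get_mappings_by_subject() (its signature is not shown here); `slot_subject_id` and `rows_for_subject` are hypothetical helpers, and the two rows reuse slot mappings named earlier in this PR.

```python
def slot_subject_id(class_name, slot_name, prefix="d4d"):
    """Build a slot-level subject_id in the underscore form d4d:Class_slot."""
    return f"{prefix}:{class_name}_{slot_name}"


def rows_for_subject(rows, subject_id):
    """Return the mapping rows whose subject_id matches exactly."""
    return [row for row in rows if row["subject_id"] == subject_id]


rows = [
    {"subject_id": slot_subject_id("FileCollection", "total_bytes"), "object_id": "dcat:byteSize"},
    {"subject_id": slot_subject_id("File", "file_type"), "object_id": "d4d:fileType"},
]

# The underscore form resolves; any other spelling of the same slot does not.
hit = rows_for_subject(rows, "d4d:FileCollection_total_bytes")
miss = rows_for_subject(rows, "d4d:FileCollection.total_bytes")
print(hit[0]["object_id"], len(miss))
```

Because the match is an exact string comparison, a row written in any other form would silently drop out of the lookup, which is the failure mode this commit addresses.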

Tests: 190 passed, 2 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The skill doc still pointed contributors at the pre-rename paths
(src/data_sheets_schema/alignment/, data/mappings/, src/alignment/,
tests/test_alignment/) so its grep, git-add, and validation snippets
no longer matched the canonical files. Repointed every reference to
the renamed directories.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Coverage now: 73/76 d4d-core classes (96%) and 74/77 full schema
classes (96%) mapped via class-level SKOS triples. Only the abstract
base classes — DatasetProperty, Information, NamedThing — remain
intentionally unmapped.

SKOS TTL changes (v1.2 → 189 triples, was 119):
- New class-level mappings for: DataSubset, CoreDataset,
  CoreDatasetCollection, CoreDistribution
- New sub-entity class mappings for: Person, Creator, Organization,
  Grantor, Grant, FundingMechanism, VariableMetadata, Software,
  Maintainer, DataCollector
- New DatasetProperty subclass mappings for ~50 wrapper classes
  (Instance, SamplingStrategy, LabelingStrategy, AnnotationAnalysis,
  HumanSubjectResearch, InformedConsent, Deidentification,
  RawDataSource, ImputationProtocol, AtRiskPopulations, …)
- Final gap-fill for consent workflow, ExportControlRegulatoryRestrictions,
  MissingInfo, ThirdPartySharing, FormatDialect

SSSOM TSV: +7 class-level rows (114 total) — explicit Core* +
DataSubset rows so d4d-core has its own class-level SSSOM coverage
(d4d-core is the main visible / messaged data product).

Structural SSSOM: +4 class-level rows for the same.

Fig 5 (d4d-core) and Fig 6 (full) regenerated. Coloring scheme
updated: ORANGE = in SSSOM exchange layer (mapped); BLUE = not yet
mapped. Mapping detection credits a class if either it has a
class-level SKOS triple OR a schema slot ranges to it with a
slot-level SKOS triple.
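
The crediting rule above can be sketched as a small predicate. The inputs are assumptions: `class_triple_subjects` and `slot_triple_subjects` stand for the subjects found in the SKOS TTL, `slot_ranges` maps slot name to range class, and the toy data is illustrative rather than read from the schema.

```python
def is_mapped(cls, class_triple_subjects, slot_ranges, slot_triple_subjects):
    """Credit a class if it has a class-level triple, or if any slot whose
    range is the class has a slot-level triple."""
    if cls in class_triple_subjects:
        return True
    return any(
        slot in slot_triple_subjects
        for slot, range_cls in slot_ranges.items()
        if range_cls == cls
    )


def coverage(classes, class_triple_subjects, slot_ranges, slot_triple_subjects):
    """Return (mapped count, total count) for the given classes."""
    mapped = [
        c for c in classes
        if is_mapped(c, class_triple_subjects, slot_ranges, slot_triple_subjects)
    ]
    return len(mapped), len(classes)


# Toy inputs: File is credited directly, FormatDialect via the slot ranging
# to it, and the abstract base class NamedThing stays unmapped.
classes = ["File", "FormatDialect", "NamedThing"]
class_triples = {"File"}
slot_ranges = {"dialect": "FormatDialect"}
slot_triples = {"dialect"}
print(coverage(classes, class_triples, slot_ranges, slot_triples))
```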

Tests: 190 passed, 2 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Horizontal bar chart of every external vocabulary referenced by the
d4d-core schema (class_uri / slot_uri / *_mappings) plus SKOS-target
counts from the alignment TTL. Bars colored by category:

- FAIR-core (schema.org, DCTerms, DCAT)
- RAI / Ethics (Croissant RAI, DUO)
- Provenance / Quality (FAIRSCAPE EVI, PROV-O, AIO, QUDT, SKOS)
- Domain / Internal (d4d:)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Top-cropped 1100px slice from each project's curated
*_human_readable.html so the poster panel shows the project name
("AI READI Dataset Documentation", etc.) plus the first section
header rather than mid-document content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds src/html/output/forced-dark.css — a dark theme override that
extends datasheet-common.css with table styling for dark mode (the
existing @media (prefers-color-scheme: dark) block didn't cover
.data-table). Used via weasyprint's -s flag to render the four
project HTML thumbnails for the poster.

The dark theme matches the requested design: navy/charcoal
background, purple gradient header, dark cards with rounded corners,
blue-accent left border on table cells. Margins auto-trimmed in the
render script so the thumbnails are tight to content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reduce thumb crop from 2400px back to 1750px (between original 1200
and doubled 2400) to make room for a d4d-core LinkML schema snippet
PNG below each project thumbnail in the poster's records panel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add `annotations.d4d:section_question` to each of the 8 D4D module YAMLs
holding the canonical Datasheets-for-Datasets paper question (e.g.
Motivation: "For what purpose was the dataset created?"). Update the
human-readable HTML renderer to read these annotations via yaml.safe_load
and use them as the per-section subtitle, with the previous hardcoded
strings retained only as fallbacks.

This makes the schema the single source of truth for section subtitles,
so the HTML and YAML can no longer drift (e.g. "Why was this dataset
created?" vs "For what purpose was the dataset created?"). Re-rendered
the 4 curated GC datasheets and the dark-theme poster thumbnails.
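
The annotation-with-fallback lookup might look roughly like the sketch below. It assumes `module_doc` is the dict that yaml.safe_load returns for a module YAML and that the annotation sits under a top-level `annotations` key; the function name `section_subtitle` and the handling of an expanded `{"value": ...}` annotation form are assumptions, not the renderer's actual code.

```python
ANNOTATION_KEY = "d4d:section_question"


def section_subtitle(module_doc, fallback):
    """Return the section question from a parsed module YAML, else the fallback."""
    annotations = (module_doc or {}).get("annotations") or {}
    value = annotations.get(ANNOTATION_KEY)
    # Assumption: an annotation may also be written in an expanded form
    # such as {"tag": ..., "value": ...}; unwrap it if so.
    if isinstance(value, dict):
        value = value.get("value")
    if isinstance(value, str) and value.strip():
        return value.strip()
    return fallback


motivation = {"annotations": {ANNOTATION_KEY: "For what purpose was the dataset created?"}}
print(section_subtitle(motivation, "Why was this dataset created?"))
print(section_subtitle({}, "Why was this dataset created?"))
```

Keeping the fallback as an explicit argument mirrors the commit's intent: the hardcoded string is used only when the schema carries no annotation.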

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Split 76 d4d-core classes into 4 quartiles (19 each) arranged 2x2 so the
row labels can grow from 8pt to 18pt while keeping all classes visible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Split 76 d4d-core classes into 6 groups (~13 each) arranged 2x3 to bring
row labels up to 20pt (from 18pt). Drop the "→ target" suffix to fit
narrower per-panel widths; the target type is captured by the
D4D-core / RO-Crate column headers above each panel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Robot enlarged (use width 320 → 440) with viewBox extended to fit
  three angled arms per side (6 arms total).
- "Working indicator" gears under the robot recolored from orange to
  black per poster review feedback.
- Bumped chest "agent" text and "D4D-Core record" header from 16pt to
  20pt so all text in the figure is at least 20pt.
- Tagline updated: "Beautiful, computable D4D records" →
  "Standardized, computable D4D records".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extended the robot head height 120 → 140 px and shifted it up by 15 px.
Added a red curved headband across the top of the head with bold white
"D4D" lettering.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each Heterogeneous-sources row now has 4 stick figures (a foreground
"main" person plus 3 smaller / faded background figures) so the figure
reads as multiple project members contributing each kind of source
document.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaced the chunky 24px arms with thin 12px arms (left/right side, 3
each) explicitly pointing outward at ±45° and 0°. Recolored the chest
"D4D agent" panel to match the red headband (white text on red).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Insert a LinkML-schema-validation block between the agent and the output
records. A red dashed feedback arrow ("errors → corrections") loops from
the validator back to the agent, showing iterative correction. Source-
to-robot input arrows trimmed by ~70 px so they don't intrude into the
robot's outward-pointing arms.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Re-routed the validation feedback path from the bottom of the agent
to the headlight at the top of the antenna, and fixed the red arrow
marker to userSpaceOnUse so the arrowhead is correctly sized (18x18 px)
relative to the headlight instead of being scaled up 5x by stroke
width. Also recolored the headlight from orange to clear red with a
subtle dark-red outline to match the "red light" framing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ler boxes

- Scale all fonts up so every label is at least 22 pt (sources/records
  headers 36, agent label 34, headband 26, project labels 24, etc.).
- Robot arms restructured: LEFT side gets three arms (top points UP,
  middle points LEFT, bottom points DOWN); RIGHT side has a single
  middle arm pointing into the validator. Arms are thin (14 px).
- Source rows: people enlarged (main 80→115, crowd 65→95 / 55→80) and
  cardboard boxes shrunk (220→170 wide, 160→120 tall) so contributors
  read as people, not packages.
- Feedback path tightened to land squarely on the antenna's red
  headlight at (1075,215); removed the redundant "validated" label
  next to the validator (records already show green checkmarks).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Strip data/poster_assets/ — figures, thumbnails, QR codes, robot SVG —
that were committed in support of the Bridge2AI April 2026 F2F poster.
The Google Slides deck remains the canonical artefact for poster work;
none of these PNG/SVG/dot files are referenced by the schema or
publication site.

Schema and renderer changes that were motivated by poster work but are
useful in their own right are preserved on this branch (module-level
section_question annotations, the renderer reading them, the SSSOM
exchange-layer expansions, and the alignment → semantic_exchange
rename).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Restore the D4D Assistant pipeline figure (was removed in previous
clean-up) and tweak two details:

- Left arms now point diagonally away from the body: top arm UP-LEFT
  (rotate +45 around the upper-left shoulder), bottom arm DOWN-LEFT
  (rotate -45 around the lower-left shoulder); middle arm horizontal
  LEFT.
- "errors → corrections" label moved from y=55 down to y=145, hugging
  the apex of the red dashed feedback arc instead of floating high
  above it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch from 6 small panels (2x3, 13 classes each) to 3 long panels
(1x3, ~25 classes each) so the row labels can grow to 22pt while still
showing all 76 d4d-core classes. Figure aspect now ~1.25 (taller)
to fill the white space available below in the exchange-layer panel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the orange-vs-blue (mapped vs unmapped) coloring with a
per-module color scale on both fig5 (d4d-core) and fig6 (full schema):
Motivation/Composition/Collection/Preprocessing/Uses/Distribution/
Maintenance/Human/Ethics/Data_Governance get distinct hues; Variables
+ Base_import + structural top-level classes use neutral grays.

Composition arrows are colored by the source class's module so each
hub fans out as a module-colored star. All edge weights bumped +1 pt
(2.5 → 3.5 for hub borders, 1.5 → 2.5 for normal borders, 1 → 2 for
composition edges) so the figures read clearly at poster scale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch sfdp from splines=spline (which fell back to straight lines
when nodes touched) to splines=curved (Catmull-Rom, more permissive).
All edges get a B3 alpha suffix (70% opacity) so when an arrow crosses
a node or another arrow it reads through instead of obscuring.
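
For reference, the B3 suffix is the alpha byte of a #RRGGBBAA color, and 0xB3 / 255 is roughly 0.70. A minimal sketch, with hypothetical helper names and an arbitrary example color:

```python
def with_alpha(hex_color, alpha_hex="B3"):
    """Append a two-digit alpha channel to a #RRGGBB color."""
    if len(hex_color) != 7 or not hex_color.startswith("#"):
        raise ValueError(f"expected #RRGGBB, got {hex_color!r}")
    int(alpha_hex, 16)  # raises ValueError if alpha_hex is not hexadecimal
    return hex_color + alpha_hex


def opacity(alpha_hex):
    """Fraction of full opacity encoded by a two-digit hex alpha."""
    return int(alpha_hex, 16) / 255


print(with_alpha("#1F77B4"))
print(round(opacity("B3"), 2))
```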

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop the "Bridge2AI Semantic Exchange Layer — slot density across all
76 d4d-core classes ..." suptitle from the figure itself; the panel
title above the figure on the poster already carries that framing.
The freed vertical space lets the bars take up more of the canvas.
…in hub

The dialect → FormatDialect edge was getting visually lost among the ~50
other edges out of CoreDataset. Render it solid (no alpha), thicker
(4.0pt), and with a bold, larger label.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
realmarcin and others added 3 commits April 27, 2026 11:17
The previous "top 26 / next 25 / last 25 of 76" wording was confusing —
"top 26 + next 25" reads as 51, not 51 of a larger set. Use explicit
"ranks 1–26 of 76 (most slots)" / "ranks 27–51 of 76" / "ranks 52–76 of
76 (fewest slots)" so the totals are unambiguous.
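
The explicit rank labels can be derived rather than hand-written. A small sketch, assuming contiguous panels where earlier panels absorb the remainder; `rank_panels` is a hypothetical helper, not the figure script's code.

```python
def rank_panels(total, panels):
    """Split ranks 1..total into contiguous panels; earlier panels take the remainder."""
    base, extra = divmod(total, panels)
    labels, start = [], 1
    for i in range(panels):
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        labels.append(f"ranks {start}\u2013{end} of {total}")  # \u2013 is an en dash
        start = end + 1
    return labels


for label in rank_panels(76, 3):
    print(label)
```

For 76 classes in 3 panels this yields panel sizes of 26, 25, and 25, matching the wording adopted in this commit.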

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Poster artifacts (PNG/SVG figures) belong outside this repo. Keeping
them on the working tree only — not committed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Re-rendered AI_READI/CM4AI/VOICE curated HTMLs with current renderer
(adds blue-bar styling on description fields, schema-driven section
subtitles).

Add the renderer's stylesheet to the nested docs path so GitHub Pages
serves it co-located with the HTML files. Without this, the relative
<link href="datasheet-common.css"> 404s on the deployed site and the
page renders unstyled.
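
A check along these lines could catch the 404 before deploy. This is a sketch under assumptions: it treats any `<link>` whose href ends in `.css` as a stylesheet reference and only verifies relative links; `missing_stylesheets` is a hypothetical helper and the demo runs against a temporary directory, not the real docs tree.

```python
import tempfile
from html.parser import HTMLParser
from pathlib import Path


class _CssLinks(HTMLParser):
    """Collect href values from <link> tags that point at .css files."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "link" and href and href.lower().endswith(".css"):
            self.hrefs.append(href)


def missing_stylesheets(html_dir):
    """Return (html file name, href) pairs whose relative stylesheet is absent."""
    missing = []
    for page in sorted(Path(html_dir).glob("*.html")):
        parser = _CssLinks()
        parser.feed(page.read_text(encoding="utf-8"))
        for href in parser.hrefs:
            if "://" in href or href.startswith("/"):
                continue  # only relative links depend on co-location
            if not (page.parent / href).is_file():
                missing.append((page.name, href))
    return missing


with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "AI_READI_human_readable.html"
    page.write_text('<head><link href="datasheet-common.css"></head>', encoding="utf-8")
    before = missing_stylesheets(tmp)  # stylesheet not yet co-located
    (Path(tmp) / "datasheet-common.css").write_text("body {}", encoding="utf-8")
    after = missing_stylesheets(tmp)   # stylesheet now sits next to the HTML
print(before, after)
```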

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 29, 2026 03:30
Contributor

Copilot AI left a comment

Pull request overview

Adds missing Semantic Exchange Layer coverage for newly introduced D4D classes and aligns the HTML renderer/documentation with schema-driven section subtitles, while relocating mapping artifacts to the semantic_exchange/ directory structure used by the generators and tests.

Changes:

  • Extend SKOS/SSSOM alignment to cover DatasetCollection, FileCollection, File, plus additional class/slot mappings and regenerate semantic-exchange TSVs under data/semantic_exchange/.
  • Add d4d:section_question annotations to core D4D module YAMLs and update the human-readable renderer to source section subtitles from schema annotations.
  • Update docs, Make targets, and tests to use data/semantic_exchange/ and add new MkDocs pages for D4D-Core + Semantic Exchange.

Reviewed changes

Copilot reviewed 49 out of 81 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/test_semantic_exchange/test_sssom_validation.py | Update tests to read SSSOM artifacts from data/semantic_exchange/. |
| tests/test_semantic_exchange/__init__.py | Add package init/docstring for semantic-exchange tests. |
| tests/test_fairscape_integration/test_sssom_reader.py | Point real-file structural mapping load test at data/semantic_exchange/. |
| tests/test_fairscape_integration/test_sssom_integration.py | Point real-file structural mapping load test at data/semantic_exchange/. |
| src/semantic_exchange/implement_uri_mappings.py | Fix docstring paths/usage to reflect semantic-exchange directory layout. |
| src/semantic_exchange/generate_structural_mapping.py | Generate structural mapping output into data/semantic_exchange/. |
| src/semantic_exchange/generate_sssom_uri_mapping.py | Update default SKOS/SSSOM output paths to src/data_sheets_schema/semantic_exchange/. |
| src/semantic_exchange/generate_sssom_mapping.py | Update default SKOS/SSSOM output paths to src/data_sheets_schema/semantic_exchange/. |
| src/semantic_exchange/generate_comprehensive_sssom_uri.py | Update default SKOS/SSSOM output paths to src/data_sheets_schema/semantic_exchange/. |
| src/semantic_exchange/generate_comprehensive_sssom.py | Update default SKOS/SSSOM output paths to src/data_sheets_schema/semantic_exchange/. |
| src/semantic_exchange/add_slot_uris.py | Add one-shot helper to apply slot_uri recommendations into schema YAMLs. |
| src/semantic_exchange/add_module_column.py | Write SSSOM module-annotation outputs into data/semantic_exchange/. |
| src/html/output/forced-dark.css | Add a forced dark theme stylesheet intended for screenshot renders. |
| src/html/human_readable_renderer.py | Pull section subtitles from module annotations.d4d:section_question with fallbacks. |
| src/fairscape_integration/fairscape_to_d4d.py | Update default semantic SSSOM path to src/data_sheets_schema/semantic_exchange/. |
| src/fairscape_integration/README_STANDARD_TOOLING.md | Update example paths from data/mappings/ to data/semantic_exchange/. |
| src/data_sheets_schema/semantic_exchange/d4d_rocrate_sssom_uri_mapping.tsv | Add/refresh generated URI-level SSSOM mapping TSV in canonical location. |
| src/data_sheets_schema/semantic_exchange/d4d_rocrate_sssom_mapping_subset.tsv | Add/refresh subset semantic SSSOM TSV in canonical location. |
| src/data_sheets_schema/semantic_exchange/d4d_rocrate_skos_alignment.ttl | Expand SKOS alignment triples (incl. DCAT prefix and new class/slot mappings). |
| src/data_sheets_schema/schema/data_sheets_schema_core_all.yaml | Update see_also SKOS alignment path to new semantic-exchange location. |
| src/data_sheets_schema/schema/data_sheets_schema_core.yaml | Update see_also SKOS alignment path to new semantic-exchange location. |
| src/data_sheets_schema/schema/D4D_Uses.yaml | Add d4d:section_question annotation for renderer subtitle. |
| src/data_sheets_schema/schema/D4D_Preprocessing.yaml | Add d4d:section_question annotation for renderer subtitle. |
| src/data_sheets_schema/schema/D4D_Motivation.yaml | Add d4d:section_question annotation for renderer subtitle. |
| src/data_sheets_schema/schema/D4D_Maintenance.yaml | Add d4d:section_question annotation for renderer subtitle. |
| src/data_sheets_schema/schema/D4D_Human.yaml | Add d4d:section_question annotation for renderer subtitle. |
| src/data_sheets_schema/schema/D4D_Distribution.yaml | Add d4d:section_question annotation for renderer subtitle. |
| src/data_sheets_schema/schema/D4D_Core.yaml | Update see_also SKOS alignment path to new semantic-exchange location. |
| src/data_sheets_schema/schema/D4D_Composition.yaml | Add d4d:section_question annotation for renderer subtitle. |
| src/data_sheets_schema/schema/D4D_Collection.yaml | Add d4d:section_question annotation for renderer subtitle. |
| notes/SEMANTIC_EXCHANGE_IMPLEMENTATION.md | Update referenced semantic-exchange artifact paths. |
| mkdocs.yml | Add nav entries for new d4d_core.md and semantic_exchange.md pages. |
| docs/semantic_exchange.md | New docs page describing the Semantic Exchange Layer artifacts and workflows. |
| docs/html_output/concatenated/curated/datasheet-common.css | Add co-located common stylesheet for curated HTML outputs. |
| docs/home.md | Add D4D-Core + Semantic Exchange sections and update repo structure links. |
| docs/d4d_core.md | New docs page introducing the D4D-Core schema and curated examples. |
| data/semantic_exchange/uri_mapping_recommendations.md | Add URI mapping recommendation write-up under semantic-exchange data dir. |
| data/semantic_exchange/d4d_rocrate_structural_mapping_summary.md | Add generated structural mapping summary markdown. |
| data/semantic_exchange/d4d_rocrate_structural_mapping.sssom.tsv | Add new mapping rows (incl. DatasetCollection/FileCollection/File and class-level rows). |
| data/semantic_exchange/d4d_rocrate_sssom_uri_mapping.tsv | Add/refresh generated URI-level mapping TSV under data dir. |
| data/semantic_exchange/d4d_rocrate_sssom_mapping_subset.tsv | Add/refresh subset semantic SSSOM TSV under data dir. |
| data/semantic_exchange/STRUCTURAL_MAPPING_ANALYSIS.md | Update script/path references to semantic-exchange generator + output dirs. |
| data/semantic_exchange/README.md | New README describing semantic-exchange data directory contents. |
| data/poster_assets/figures/fig6_d4d_full.dot | Remove poster-asset source artifact (DOT). |
| data/poster_assets/figures/fig5_d4d_core_schema.mmd | Remove poster-asset source artifact (Mermaid). |
| data/poster_assets/figures/fig5_d4d_core_full.mmd | Remove poster-asset source artifact (Mermaid). |
| data/poster_assets/figures/fig5_d4d_core_full.dot | Remove poster-asset source artifact (DOT). |
| data/poster_assets/figures/fig3_pipeline_assistant.mmd | Remove poster-asset source artifact (Mermaid). |
| data/poster_assets/figures/fig3_humans_robot.svg | Remove poster-asset source artifact (SVG). |
| data/poster_assets/figures/fig2_rocrate_bridge.mmd | Remove poster-asset source artifact (Mermaid). |
| data/poster_assets/figures/fig1_semantic_exchange.mmd | Remove poster-asset source artifact (Mermaid). |
| data/poster_assets/figures/fig6_d4d_full.png | Add/update poster figure raster asset. |
| data/poster_assets/figures/fig5_d4d_core_schema_v3.png | Add/update poster figure raster asset. |
| data/poster_assets/figures/fig5_d4d_core_schema.png | Add/update poster figure raster asset. |
| data/poster_assets/figures/fig5_d4d_core_full.png | Add/update poster figure raster asset. |
| data/poster_assets/figures/fig3_pipeline_assistant.png | Add/update poster figure raster asset. |
| data/poster_assets/figures/fig3_humans_robot.png | Add/update poster figure raster asset. |
| data/poster_assets/figures/fig2_rocrate_bridge.png | Add/update poster figure raster asset. |
| data/poster_assets/figures/fig1_semantic_exchange.png | Add/update poster figure raster asset. |
| data/poster_assets/d4d_core_thumbnails/qr_repo.png | Add/update thumbnail/QR asset. |
| data/poster_assets/d4d_core_thumbnails/qr_docs.png | Add/update thumbnail/QR asset. |
| data/poster_assets/d4d_core_thumbnails/VOICE_thumb.png | Add/update thumbnail asset. |
| data/poster_assets/d4d_core_thumbnails/CM4AI_thumb.png | Add/update thumbnail asset. |
| data/poster_assets/d4d_core_thumbnails/CHORUS_thumb.png | Add/update thumbnail asset. |
| data/poster_assets/d4d_core_thumbnails/AI_READI_thumb.png | Add/update thumbnail asset. |
| README.md | Document D4D-Core entry point + semantic exchange artifacts and test/build commands. |
| Makefile | Repoint SSSOM generator targets and artifact paths to semantic-exchange directories. |
| .claude/commands/d4d-add-mapping.md | Add a Claude Code skill document for adding new exchange-layer mappings. |
| .claude/commands/README.md | List new /d4d-add-mapping command in Claude commands README. |
Comments suppressed due to low confidence (1)

data/semantic_exchange/d4d_rocrate_structural_mapping.sssom.tsv:160

  • These new rows are class-level mappings (subject_id lacks the usual d4d:Class/slot shape used throughout this structural mapping file). If downstream tooling assumes structural mappings are slot-level, these entries can create ambiguity. Consider moving class-level alignments to the semantic SSSOM/SKOS TTL (and keeping this file slot-level), or adopt a consistent convention (e.g., a reserved pseudo-slot) and update any readers/docs accordingly.
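
One way a reader could guard against the ambiguity raised here is to partition rows by subject_id shape before use. The sketch below assumes slot-level subjects in the structural file follow the d4d:Class/slot shape mentioned in the comment; `split_by_level` and the sample ids are hypothetical.

```python
def split_by_level(subject_ids, separator="/"):
    """Partition subject_ids into slot-level (separator present after the
    prefix) and class-level (no separator)."""
    slot_level, class_level = [], []
    for sid in subject_ids:
        local = sid.split(":", 1)[-1]  # drop the CURIE prefix, if any
        (slot_level if separator in local else class_level).append(sid)
    return slot_level, class_level


ids = ["d4d:FileCollection/file_count", "d4d:DataSubset", "d4d:File/file_type"]
slot_level, class_level = split_by_level(ids)
print(slot_level, class_level)
```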

Comment thread src/html/human_readable_renderer.py
Comment thread src/html/output/forced-dark.css Outdated
realmarcin and others added 2 commits April 28, 2026 21:15
1. src/html/human_readable_renderer.py: open the module YAML with
   explicit encoding='utf-8' so non-ASCII section_question text decodes
   consistently across platforms (matches the convention used in
   render_yaml_file).

2. src/html/output/forced-dark.css: remove. Was added for one-off poster
   screenshots and not referenced anywhere — the renderer/CLI doesn't
   accept a CSS-selection flag, so the file was dead weight.
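
The encoding fix in item 1 amounts to passing encoding='utf-8' to open() so the platform default is never consulted. A minimal sketch, with a hypothetical helper name and an illustrative non-ASCII string written to a temporary file:

```python
import tempfile
from pathlib import Path


def read_text_utf8(path):
    """Read a text file as UTF-8 regardless of the platform's default encoding."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


question = "Qui a créé le jeu de données ?"  # illustrative non-ASCII text
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "module.yaml"
    path.write_bytes(question.encode("utf-8"))  # bytes on disk are UTF-8
    round_tripped = read_text_utf8(path)
print(round_tripped == question)
```

Without the explicit argument, open() falls back to a locale-dependent default, which is why the same file can decode differently across platforms.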

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Conflict resolution:
- Canonical SSSOM/SKOS files at src/data_sheets_schema/semantic_exchange/:
  kept ours (114-row mapping, plus 7 d4d-core class additions on top of
  the 107-row baseline that PR #147 already shipped, plus expanded SKOS
  TTL).
- Mapping TSVs duplicated under data/semantic_exchange/: deleted.
  PR #148 (Name cleanup) already moved them to the canonical
  src/data_sheets_schema/semantic_exchange/ location.
- Poster figures added by main (fig7_rocrate_profile.{dot,png},
  fig8_exchange_butterfly.png): removed per project rule that poster
  artifacts don't get committed here.
- README + test_sssom_validation.py: took main's version (correctly
  reflects the post-#148 structural/canonical split).
- docs/html_output/concatenated/curated/*.html re-rendered from current
  renderer + curated YAMLs (generated, not hand-merged).
- data/semantic_exchange/d4d_rocrate_structural_mapping.sssom.tsv:
  kept ours (superset of main).

Tests: tests.test_semantic_exchange.test_sssom_validation passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@realmarcin realmarcin merged commit e141852 into main Apr 29, 2026
3 checks passed
@realmarcin realmarcin deleted the update_exchange2 branch April 29, 2026 06:13


2 participants