sitemap a11y improvements#5312

Draft
StephDriver wants to merge 8 commits into master from b-5170_a11y_sitemaps
@StephDriver

closes #5170

StephDriver commented May 18, 2026

Related Fixes

While working on this, I found the following bugs and elements that were confusing to users and fixed them along the way.

cms/index.html

  • for clarity: added an 'in navigation' column to the page table, which makes clear whether each custom page is used in the navigation. This matters because only pages used in the navigation are picked up by the sitemaps. The navigation can refer to press pages or journal pages, but only the custom journal pages show up in the journal cms/index.html, so there can be confusion if the custom nav includes a link to a page of the same name on the press rather than the journal.

note: this follows the explanation given on the cms/nav.html page, where it describes how to enter navigation links:

In most cases, this should be the relative URL path to the page. The relative path is formed from 1) the journal code (acronym or abbreviation) if your journal homepage URL ends with the journal code, 2) the word “site”, and 3) whatever you put into the Link field for the corresponding page. For example, to link to a custom page you created, if the journal homepage URL is “example.com/abc”, and you put “research-integrity” in the Link field for the page, then the Link field for the nav item should be “abc/site/research-integrity”. For top-level nav items that should not also appear as sub-nav items (under themselves), leave the Link field empty. For external links, it should be an absolute URL.

  • fix: the page table includes the relative link, but this did not include the journal code prefix, so the links shown were as if for the press, rather than the link needed in the nav to reach that page.
  • TODO fix: the action buttons in the Navigation pane were all over the place. I aligned them.
  • TODO clarity: the action buttons in the Navigation pane included "enable" in green and "disable" in yellow - which I initially misunderstood as green items being 'enabled' and yellow items 'disabled' - when the button refers to the action you can do, rather than the current state. I have changed this to make it clearer which items are active and which not (and updated the language).
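The link format quoted above from cms/nav.html, which the page table now follows, can be sketched roughly as below. `build_nav_link` is a hypothetical helper for illustration, not a function from this PR.

```python
from typing import Optional


def build_nav_link(page_link: str, journal_code: Optional[str] = None) -> str:
    """Return the relative path to enter in a nav item's Link field.

    Hypothetical helper: joins 1) the journal code, when the journal
    homepage URL ends with it, 2) the word "site", and 3) the value of
    the page's own Link field.
    """
    parts = [journal_code, "site", page_link.strip("/")]
    # Skip the journal code when none applies.
    return "/".join(part for part in parts if part)


print(build_nav_link("research-integrity", journal_code="abc"))
# abc/site/research-integrity
```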

Key Features.

Front of house footer links to Sitemaps

These use a template_tag to go to the most relevant sitemap. Most of this is intuitive:

  • article -> issue sitemap from which the article is linked. Any article not in an issue links to a virtual 'no issue' issue.
  • issue -> journal index from which the issue sitemap is linked. This is the primary issue. Collections are not regarded as issues and articles are only listed for their primary issue.
  • static page, e.g. Accessibility -> Pages sitemap from which it is linked (except the Home Page, see below).
  • news item -> News sitemap from which it is linked.
  • repository subject -> while preprints could have multiple subjects, in keeping with the issue:article 1:1 relationship, they are listed on their first subject.

But some less so...

  • home page -> the siteindex for the press/journal/repo, NOT the pages sitemap. Yes, it is a page, but if you're on the home page, then the sitemap for the whole of that object is the most relevant.
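As a rough illustration of the mapping above, here is a minimal sketch of how a footer link might pick its target. The function name, the `kind` strings and the returned identifiers are all assumptions made for this example and do not come from the template tag itself. The repository subject case is left out because the target is not spelled out above.

```python
from typing import Optional

# Stand-in identifier for the virtual 'no issue' issue (assumed name).
NO_ISSUE = "no-issue"


def relevant_sitemap(kind: str, primary_issue: Optional[str] = None) -> str:
    """Return an identifier for the most relevant sitemap for a page."""
    if kind == "home":
        # The home page points at the site index, not the pages sitemap.
        return "siteindex"
    if kind == "article":
        # Articles are listed only under their primary issue.
        return f"issue/{primary_issue or NO_ISSUE}"
    if kind == "issue":
        return "journal-index"
    if kind == "static":
        return "pages"
    if kind == "news":
        return "news"
    raise ValueError(f"unknown page kind: {kind}")
```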

Which Static Pages?

Always-present

  • Home,
  • Accessibility,
  • Log in,
  • Register,
  • Privacy Policy (the url is a setting, so must be added that way).

These appear regardless of any nav flag. For example, the 'home' page can be disabled in the custom navigation but, unlike other disabled pages, it always exists!

Built-in nav pages (journal-only)

Mirrors each nav_* flag on the Journal model:

  • nav_contact → Contact,
  • nav_articles → Articles,
  • nav_issues → Issues,
  • nav_news → News (gated on has_news, see below),
  • nav_sub → Submissions,
  • nav_start → Start Submission (also checks disable_journal_submission),
  • nav_review → Become a Reviewer.

Plus journal settings:

  • enable_editorial_display → Editorial Team (with multi-page support),
  • keyword_list_page → Keywords.
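A minimal sketch of the flag-to-page mapping for the nav_* flags, assuming a plain stand-in object rather than the real Journal model. `JournalFlags` and `built_in_pages` are hypothetical names, and treating `has_news` and `disable_journal_submission` as simple booleans is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JournalFlags:
    """Stand-in for the nav_* flags on the Journal model (illustrative)."""
    nav_contact: bool = True
    nav_articles: bool = True
    nav_issues: bool = True
    nav_news: bool = True
    nav_sub: bool = True
    nav_start: bool = True
    nav_review: bool = True
    has_news: bool = False
    disable_journal_submission: bool = False


def built_in_pages(journal: JournalFlags) -> List[str]:
    """Return labels of built-in nav pages that should be listed."""
    # Each row: (flag name, label, extra condition that must also hold).
    candidates = [
        ("nav_contact", "Contact", True),
        ("nav_articles", "Articles", True),
        ("nav_issues", "Issues", True),
        ("nav_news", "News", journal.has_news),
        ("nav_sub", "Submissions", True),
        ("nav_start", "Start Submission",
         not journal.disable_journal_submission),
        ("nav_review", "Become a Reviewer", True),
    ]
    return [
        label
        for flag, label, extra in candidates
        if getattr(journal, flag) and extra
    ]
```

With the defaults above, News is left out because `has_news` is false, even though `nav_news` is on.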

CMS custom nav pages

NavigationItems with /site/ links (flat nav and subnav treated identically), plus non-CMS custom nav items.

  • /site/page links to the press site,
  • code/site/page links to the journal site.

Journals may include press links in their custom nav, even if they have pages of the same name in the CMS.

News Gate

News suppression post-processing still applies to both press and journal to handle any custom nav items pointing to /news/ when no items exist.
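The post-processing step could look something like the sketch below. It assumes sitemap entries are `(label, url)` pairs and that a news link is any path whose last segment is `news`. Both are assumptions for this example, as is the function name.

```python
from typing import List, Tuple

Entry = Tuple[str, str]  # (label, url), an assumed shape for this sketch


def suppress_empty_news(entries: List[Entry], has_news: bool) -> List[Entry]:
    """Drop links that point at /news/ when there are no news items."""
    if has_news:
        return list(entries)
    return [
        (label, url)
        for label, url in entries
        # Compare the last path segment so "/news/" and "abc/news" both match.
        if url.strip("/").split("/")[-1] != "news"
    ]
```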

Handling of CMS pages

CMS pages are only included if they are in the Navigation. This is because we want to use the Navigation + Index pages (e.g. Issues) with the Sitemap to demonstrate compliance with WCAG Multiple Ways. So if a CMS page isn't in the navigation, it is treated as not in use. Note that some installs rely on this, keeping draft or backup CMS pages by leaving them out of the navigation.

To make this clearer in the manager, I have included an additional column next to each CMS page, that shows where it is in the navigation (or that it isn't there).

CMS pages that have no content will 404, so these are also filtered out of the sitemap.

The CMS page name is not used as the label for the link in the sitemap. Instead, the label in the navigation is used, for consistency, as it is the label presented to users.
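The three rules above (must be in the navigation, must have content, label comes from the navigation) can be sketched as follows. The `CmsPage` and `NavItem` classes are simplified stand-ins, not the real models, and `cms_sitemap_entries` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CmsPage:
    name: str      # page name shown in the manager
    link: str      # value of the page's Link field
    content: str


@dataclass
class NavItem:
    label: str     # label shown to users in the navigation
    link: str      # e.g. "abc/site/research-integrity"


def cms_sitemap_entries(
    pages: List[CmsPage], nav_items: List[NavItem], journal_code: str
) -> List[Tuple[str, str]]:
    """Return (label, link) pairs for CMS pages that belong in the sitemap."""
    pages_by_path = {f"{journal_code}/site/{p.link}": p for p in pages}
    entries = []
    # Iterating over nav items means pages outside the nav are never listed.
    for item in nav_items:
        page = pages_by_path.get(item.link.strip("/"))
        if page is None:
            continue  # not a CMS page of this journal
        if not page.content.strip():
            continue  # an empty page would 404
        # Use the navigation label, not the CMS page name.
        entries.append((item.label, item.link))
    return entries
```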

Handling of Nav links

External links are ignored, as they are not part of the site.

For journals and repositories, press links are ignored, as they will be on the press sitemap.

Orphaned nav pages are also ignored. An orphan arises when a page was added as a sub-navigation item and the parent navigation item was later deleted without the sub-nav item itself being deleted. These do not show up in the navigation, but they do show up in a database query, so they are specifically filtered out.
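A minimal sketch of these three filters for a journal, assuming a simplified `NavItem` stand-in with a `parent_pk` field and treating any link whose first path segment is `site` as a press link. The names and the in-memory approach are illustrative only, since the real code works against the database.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class NavItem:
    pk: int
    label: str
    link: str
    parent_pk: Optional[int] = None  # set for sub-nav items


def journal_nav_items(items: List[NavItem]) -> List[NavItem]:
    """Return the nav items a journal sitemap should consider."""
    existing = {item.pk for item in items}
    kept = []
    for item in items:
        parsed = urlparse(item.link)
        if parsed.scheme or parsed.netloc:
            continue  # external link (absolute URL)
        if parsed.path.strip("/").split("/")[0] == "site":
            continue  # press link, listed on the press sitemap instead
        if item.parent_pk is not None and item.parent_pk not in existing:
            continue  # orphan: its parent nav item was deleted
        kept.append(item)
    return kept
```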

Ensuring Unique Contextual Links

For WCAG compliance, it is important that link text is unique on the page. We handle this in two stages. First, each URL should only appear once on a page (deduplication). Second, where different URLs have the same text label, we disambiguate them by adding [different text] as a suffix. There are several different types of disambiguation in the code, to handle press and journals having the same name, issues of the same name, articles of the same name, news items of the same name, etc. Ultimately, the end goal is that all links on the page have unique text.
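The two stages can be sketched in a few lines. This is a simplified illustration, not the PR's code: it assumes each link arrives as a `(label, url, context)` tuple, where `context` is whatever distinguishing text is available (a volume, a date, a journal name). The URLs in the usage example are made up.

```python
from collections import Counter
from typing import List, Tuple


def unique_links(links: List[Tuple[str, str, str]]) -> List[Tuple[str, str]]:
    """Deduplicate by URL, then suffix clashing labels with their context."""
    # Stage 1: keep only the first occurrence of each URL.
    seen = set()
    deduped = []
    for label, url, context in links:
        if url in seen:
            continue
        seen.add(url)
        deduped.append((label, url, context))

    # Stage 2: add a "[context]" suffix where a label is shared.
    counts = Counter(label for label, _, _ in deduped)
    result = []
    for label, url, context in deduped:
        if counts[label] > 1:
            label = f"{label} [{context}]"
        result.append((label, url))
    return result


links = [
    ("Issue 1", "/abc/issue/1/", "Volume 1"),
    ("Issue 1", "/abc/issue/7/", "Volume 2"),
    ("Issue 1", "/abc/issue/1/", "Volume 1"),  # duplicate URL, dropped
    ("News", "/abc/news/", ""),
]
print(unique_links(links))
```

Note that this sketch only yields unique text when the contexts themselves differ, which is presumably why the code needs several kinds of disambiguation.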

Validation

Sitemaps are validated as per the Sitemaps.org protocol. This required a change in the structure of the sitemaps from what had been agreed during Backlog refinement. A sitemap index cannot contain a mix of page links and child sitemaps, so the static links have been moved into their own child sitemap.
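To illustrate the constraint, here is a small sketch using Python's standard library: a sitemap index holds only `<sitemap>` children, while page links live in a separate `<urlset>` document. The function names and example URLs are hypothetical, and the PR itself may render its sitemaps differently.

```python
import xml.etree.ElementTree as ET
from typing import List

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(child_sitemap_urls: List[str]) -> str:
    """Build a sitemap index, which may only list child sitemaps."""
    root = ET.Element("sitemapindex", xmlns=NS)
    for url in child_sitemap_urls:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_urlset(page_urls: List[str]) -> str:
    """Build a child sitemap, which may only list page URLs."""
    root = ET.Element("urlset", xmlns=NS)
    for url in page_urls:
        entry = ET.SubElement(root, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(root, encoding="unicode")


# Static links get their own child sitemap instead of sitting in the index.
index_xml = build_sitemap_index([
    "https://example.com/abc/sitemap-pages.xml",
    "https://example.com/abc/sitemap-issues.xml",
])
pages_xml = build_urlset(["https://example.com/abc/accessibility/"])
```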

Testing

Added a [redacted] tonne of tests - to guard against duplicates or missing items in the sitemaps in future.

Docs

  • updated sitemaps-and-robots.md developer doc.
  • TODO update the user-facing docs to explain which pages will show up
  • TODO create issue to re-audit for WCAG multiple ways compliance

@StephDriver StephDriver force-pushed the b-5170_a11y_sitemaps branch from 77a482e to ea56f0e Compare May 19, 2026 15:50
Development

Successfully merging this pull request may close these issues.

Improve Front of House Sitemaps
