-
Notifications
You must be signed in to change notification settings - Fork 5
Add sitemap generator and serve sitemap from S3 #1037
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Robert Smith <[email protected]>
Signed-off-by: Robert Smith <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds sitemap generation functionality to the Better Together Community Engine, storing generated sitemaps in Active Storage (S3) and serving them via a dedicated controller. The implementation scopes sitemaps to individual platforms and includes background job support for automatic regeneration.
Key Changes:
- Adds Sitemap model with platform association and Active Storage file attachment
- Implements SitemapRefreshJob to regenerate sitemaps in the background
- Creates rake task and configuration for sitemap generation using sitemap_generator gem
- Adds controller endpoint to serve sitemap files from S3
Reviewed changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| db/migrate/20250821120000_create_better_together_sitemaps.rb | Creates sitemaps table with platform association using Better Together migration helpers |
| spec/dummy/db/schema.rb | Schema update reflecting new sitemaps table with UUID primary key and platform foreign key |
| app/models/better_together/sitemap.rb | Sitemap model with platform association, file attachment, and current finder method |
| app/models/better_together/platform.rb | Adds has_one sitemap association to Platform model |
| app/models/better_together/page.rb | Adds after_commit callback to trigger sitemap refresh on page changes |
| app/jobs/better_together/sitemap_refresh_job.rb | Background job to invoke rake task for sitemap generation |
| app/controllers/better_together/sitemaps_controller.rb | Controller to redirect to Active Storage URL for sitemap file |
| app/views/layouts/better_together/application.html.erb | Adds sitemap link tag to HTML head |
| config/sitemap.rb | Sitemap generator configuration defining URLs to include |
| config/routes.rb | Adds sitemap route outside locale scope |
| lib/tasks/sitemap.rake | Rake task for sitemap generation and asset precompile hook |
| spec/requests/better_together/sitemaps_spec.rb | Request specs for sitemap controller endpoint |
| spec/jobs/better_together/sitemap_refresh_job_spec.rb | Job specs including privacy filtering test |
| better_together.gemspec | Adds sitemap_generator dependency |
| Gemfile | Adds sitemap_generator gem |
| Gemfile.lock | Updates lock file with sitemap_generator and duplicate sidekiq entries |
| BetterTogether::Community.find_each do |community| | ||
| add helpers.community_path(community, locale: I18n.default_locale), lastmod: community.updated_at | ||
| end | ||
|
|
||
| add helpers.conversations_path(locale: I18n.default_locale) | ||
| BetterTogether::Conversation.find_each do |conversation| | ||
| add helpers.conversation_path(conversation, locale: I18n.default_locale), lastmod: conversation.updated_at | ||
| end | ||
|
|
||
| add helpers.posts_path(locale: I18n.default_locale) | ||
| BetterTogether::Post.published.find_each do |post| | ||
| add helpers.post_path(post, locale: I18n.default_locale), lastmod: post.updated_at | ||
| end | ||
|
|
||
| add helpers.events_path(locale: I18n.default_locale) | ||
| BetterTogether::Event.find_each do |event| |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing privacy filtering for Communities, Events, and Conversations. The sitemap includes all Communities (line 12), Conversations (line 17), and Events (line 27) without filtering by privacy level. Since these models include the Privacy concern (Community and Event) or contain private data (Conversation), they should be filtered to only include public items:
- Line 12:
BetterTogether::Community.privacy_public.find_each - Line 17: Conversations should likely not be in the sitemap at all since they are private by nature
- Line 27:
BetterTogether::Event.privacy_public.find_each
Additionally, Posts should also filter by privacy: line 22 should be BetterTogether::Post.published.privacy_public.find_each
| BetterTogether::Community.find_each do |community| | |
| add helpers.community_path(community, locale: I18n.default_locale), lastmod: community.updated_at | |
| end | |
| add helpers.conversations_path(locale: I18n.default_locale) | |
| BetterTogether::Conversation.find_each do |conversation| | |
| add helpers.conversation_path(conversation, locale: I18n.default_locale), lastmod: conversation.updated_at | |
| end | |
| add helpers.posts_path(locale: I18n.default_locale) | |
| BetterTogether::Post.published.find_each do |post| | |
| add helpers.post_path(post, locale: I18n.default_locale), lastmod: post.updated_at | |
| end | |
| add helpers.events_path(locale: I18n.default_locale) | |
| BetterTogether::Event.find_each do |event| | |
| BetterTogether::Community.privacy_public.find_each do |community| | |
| add helpers.community_path(community, locale: I18n.default_locale), lastmod: community.updated_at | |
| end | |
| # Conversations are private and should not be included in the sitemap. | |
| add helpers.posts_path(locale: I18n.default_locale) | |
| BetterTogether::Post.published.privacy_public.find_each do |post| | |
| add helpers.post_path(post, locale: I18n.default_locale), lastmod: post.updated_at | |
| end | |
| add helpers.events_path(locale: I18n.default_locale) | |
| BetterTogether::Event.privacy_public.find_each do |event| |
| described_class.new.perform | ||
|
|
||
| expect(BetterTogether::Sitemap.current(host_platform).file).to be_attached | ||
| end | ||
|
|
||
| it 'includes only public pages in the sitemap' do | ||
| host_platform = create(:platform, :host) | ||
| public_page = create(:page, privacy: 'public', slug: 'public-page') | ||
| private_page = create(:page, privacy: 'private', slug: 'private-page') | ||
| BetterTogether::Sitemap.destroy_all | ||
|
|
||
| described_class.perform_now |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent job invocation methods between tests. Line 11 uses described_class.new.perform while line 22 uses described_class.perform_now. Use the same pattern in both tests for consistency. Prefer perform_now which is the standard RSpec pattern for testing ActiveJob jobs.
| get '/sitemap.xml.gz', to: 'sitemaps#show', as: :sitemap | ||
|
|
||
| scope ':locale', # rubocop:todo Metrics/BlockLength | ||
| locale: /#{I18n.available_locales.join('|')}/ do |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sitemap route is defined outside the :locale scope (line 6), which means it won't have access to locale-aware URL generation in the controller. However, the sitemap configuration file (config/sitemap.rb) generates locale-specific URLs using locale: I18n.default_locale for all paths. This could cause a mismatch between the sitemap route handling and the URLs it generates. Consider whether the sitemap should be locale-specific or if it should include URLs for all locales.
| get '/sitemap.xml.gz', to: 'sitemaps#show', as: :sitemap | |
| scope ':locale', # rubocop:todo Metrics/BlockLength | |
| locale: /#{I18n.available_locales.join('|')}/ do | |
| scope ':locale', # rubocop:todo Metrics/BlockLength | |
| locale: /#{I18n.available_locales.join('|')}/ do | |
| get '/sitemap.xml.gz', to: 'sitemaps#show', as: :sitemap |
| begin | ||
| Rake::Task['assets:precompile'].enhance do | ||
| Rake::Task['sitemap:refresh'].invoke | ||
| end | ||
| rescue RuntimeError | ||
| # assets:precompile may not be defined in some environments | ||
| end |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rake task hooks into assets:precompile (lines 24-26) which may not be appropriate for sitemap generation. Asset precompilation typically happens during deployment build phase, before the database is available. Sitemap generation requires database access to query Pages, Posts, Events, etc. This could cause deployment failures if the database isn't accessible during asset precompilation. Consider using a different hook (e.g., after deployment) or making this integration optional.
| begin | |
| Rake::Task['assets:precompile'].enhance do | |
| Rake::Task['sitemap:refresh'].invoke | |
| end | |
| rescue RuntimeError | |
| # assets:precompile may not be defined in some environments | |
| end | |
| # The automatic invocation of sitemap:refresh during assets:precompile has been removed. | |
| # To generate the sitemap, run `rake sitemap:refresh` after deployment when the database is available. |
| # frozen_string_literal: true | ||
|
|
||
| module BetterTogether | ||
| # Stores the generated sitemap in Active Storage for serving via S3 | ||
| class Sitemap < ApplicationRecord | ||
| belongs_to :platform | ||
|
|
||
| has_one_attached :file | ||
|
|
||
| validates :platform_id, uniqueness: true | ||
|
|
||
| def self.current(platform) | ||
| find_or_create_by!(platform: platform) | ||
| end | ||
| end | ||
| end |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing factory for the Sitemap model. According to the coding guidelines, "Every Better Together model needs a corresponding FactoryBot factory with proper engine namespace handling." Add a factory file at spec/factories/better_together/sitemaps.rb to support testing.
| <% end %> | ||
| <%= csrf_meta_tags %> | ||
| <%= csp_meta_tag %> | ||
| <link rel="sitemap" type="application/xml" href="<%= sitemap_path %>"> |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sitemap link tag is missing the locale parameter required by the engine's routing constraints. This will cause the link to fail to resolve properly. Update to:
<link rel="sitemap" type="application/xml" href="<%= sitemap_path(locale: I18n.locale) %>">| <link rel="sitemap" type="application/xml" href="<%= sitemap_path %>"> | |
| <link rel="sitemap" type="application/xml" href="<%= sitemap_path(locale: I18n.locale) %>"> |
| add helpers.communities_path(locale: I18n.default_locale) | ||
| BetterTogether::Community.find_each do |community| | ||
| add helpers.community_path(community, locale: I18n.default_locale), lastmod: community.updated_at | ||
| end | ||
|
|
||
| add helpers.conversations_path(locale: I18n.default_locale) | ||
| BetterTogether::Conversation.find_each do |conversation| | ||
| add helpers.conversation_path(conversation, locale: I18n.default_locale), lastmod: conversation.updated_at | ||
| end | ||
|
|
||
| add helpers.posts_path(locale: I18n.default_locale) | ||
| BetterTogether::Post.published.find_each do |post| | ||
| add helpers.post_path(post, locale: I18n.default_locale), lastmod: post.updated_at | ||
| end | ||
|
|
||
| add helpers.events_path(locale: I18n.default_locale) | ||
| BetterTogether::Event.find_each do |event| | ||
| add helpers.event_path(event, locale: I18n.default_locale), lastmod: event.updated_at | ||
| end |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing after_commit :refresh_sitemap callbacks for Post, Community, and Event models. Currently, only the Page model (line 60 in app/models/better_together/page.rb) triggers sitemap regeneration on changes. Since the sitemap includes Posts, Communities, and Events (config/sitemap.rb lines 22, 12, 27), these models should also trigger sitemap refresh when created, updated, or destroyed to keep the sitemap current.
| SitemapGenerator::Sitemap.public_path = Rails.root.join('tmp') | ||
| SitemapGenerator::Sitemap.sitemaps_path = '' | ||
|
|
||
| load Rails.root.join('config/sitemap.rb') |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rake task loads the sitemap configuration from the wrong location. Line 11 uses Rails.root.join('config/sitemap.rb') which looks for the file in the host application's config directory, but the sitemap.rb file is located in the engine's config directory (as shown in the diff). This should be BetterTogether::Engine.root.join('config/sitemap.rb') to correctly load the engine's sitemap configuration file.
| load Rails.root.join('config/sitemap.rb') | |
| load BetterTogether::Engine.root.join('config/sitemap.rb') |
| it 'includes only public pages in the sitemap' do | ||
| host_platform = create(:platform, :host) | ||
| public_page = create(:page, privacy: 'public', slug: 'public-page') | ||
| private_page = create(:page, privacy: 'private', slug: 'private-page') | ||
| BetterTogether::Sitemap.destroy_all | ||
|
|
||
| described_class.perform_now | ||
|
|
||
| data = BetterTogether::Sitemap.current(host_platform).file.download | ||
| xml = Zlib::GzipReader.new(StringIO.new(data)).read | ||
|
|
||
| expect(xml).to include(public_page.slug) | ||
| expect(xml).not_to include(private_page.slug) | ||
| end |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing test for public posts privacy filtering. While the test on line 16 verifies that only public pages are included in the sitemap, there's no equivalent test to verify that private posts, communities, or events are excluded. Add tests to verify privacy filtering works correctly for all model types included in the sitemap (Posts, Communities, Events).
Summary
/sitemap.xml.gzthrough a controller that redirects to the stored filecreate_bt_tableand add host-platform associationTesting
bundle exec rubocop(fails: bundler: command not found: rubocop)bundle exec brakeman -q -w2(fails: bundler: command not found: brakeman)bundle exec bundler-audit --update(fails: bundler: command not found: bundler-audit)bin/codex_style_guard(fails: bundler: command not found: rubocop)DATABASE_URL=postgresql://postgres:postgres@localhost/community_engine_test bin/ci(fails: bundler: command not found: rails)https://chatgpt.com/codex/tasks/task_e_689b7bb083bc8321a51b456c1d6b090a