Pass request_headers and max_workers in recursive sitemap_to_df calls #411

Open
danishashko wants to merge 1 commit into eliasdabbas:master from danishashko:fix/sitemap-recursive-params
@danishashko
When sitemap_to_df recurses (both for robots.txt sitemap lists and for sitemap index files), request_headers and max_workers are silently dropped. As a result, custom headers (auth tokens, User-Agent) and the thread count apply only to the top-level call.

This fix passes both arguments through at the two recursive call sites, so they propagate consistently through the whole crawl.
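
To illustrate the pattern, here is a minimal sketch of forwarding both settings through a recursive sitemap walk. It is not the library's code: `collect_urls`, `fetch`, `FAKE_SITE`, and the `example.com` URLs are hypothetical stand-ins, and an in-memory dict replaces real HTTP requests. Only the parameter names `request_headers` and `max_workers` come from this PR.

```python
# Hypothetical in-memory "site": each sitemap URL maps either to child
# sitemaps (a sitemap index) or to page URLs (a regular sitemap).
FAKE_SITE = {
    "https://example.com/sitemap_index.xml": {
        "sitemaps": [
            "https://example.com/sitemap_a.xml",
            "https://example.com/sitemap_b.xml",
        ],
    },
    "https://example.com/sitemap_a.xml": {"urls": ["https://example.com/a1"]},
    "https://example.com/sitemap_b.xml": {
        "urls": ["https://example.com/b1", "https://example.com/b2"],
    },
}

# Records the settings each (simulated) fetch was made with.
calls = []


def fetch(url, request_headers, max_workers):
    """Stand-in for an HTTP fetch; logs the settings it received."""
    calls.append((url, request_headers, max_workers))
    return FAKE_SITE[url]


def collect_urls(sitemap_url, max_workers=8, request_headers=None):
    """Walk a sitemap (or sitemap index) and return all page URLs."""
    doc = fetch(sitemap_url, request_headers, max_workers)
    urls = list(doc.get("urls", []))
    for child in doc.get("sitemaps", []):
        # Forward both settings explicitly. Calling collect_urls(child)
        # alone would reset the nested call to the defaults.
        urls.extend(
            collect_urls(
                child,
                max_workers=max_workers,
                request_headers=request_headers,
            )
        )
    return urls


headers = {"User-Agent": "my-crawler/1.0"}
pages = collect_urls(
    "https://example.com/sitemap_index.xml",
    max_workers=2,
    request_headers=headers,
)
```

After this runs, every entry in `calls` carries the same headers and worker count as the top-level call. Dropping the two keyword arguments from the recursive call reproduces the bug described above: the nested fetches would see `None` and `8` instead.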
