Skip to content

Conversation

@emarondan
Copy link

This change introduces FromSetting, a marker that allows spider parameters to declare defaults pulled directly from crawler.settings.
Defaults are resolved in _set_crawler following this precedence:

  • Values passed via CLI/kwargs (-a)
  • Literal defaults defined in the Pydantic model
  • Values from Scrapy settings using FromSetting (with the chosen getter)
  • Explicit default provided in FromSetting if the setting is missing
  • Validation error if none of the above apply

Unit tests were added to ensure correct precedence and proper integration with Scrapy settings.

raise
super().__init__(*args, **kwargs)

def _set_crawler(self, crawler):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, where is this called (normally)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh.

I think this should override from_crawler() instead, though I'm not 100% sure if it's possible.

@wRAR
Copy link
Contributor

wRAR commented Oct 1, 2025

Regarding test failures, it's an incompatibility with Scrapy 2.13 which I haven't thought of checking, either adding install_reactor("twisted.internet.asyncioreactor.AsyncioSelectorReactor") to e.g. conftest.py will be enough or the tests need to be changed to actually call and await crawler.crawl() (I haven't looked at the code).

And it should be unrelated to the changes of this PR.

@emarondan
Copy link
Author

I added a conftest.py to make it pass the tests locally. The file content is:

import os

os.environ.setdefault(
    "TWISTED_REACTOR",
    "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
)

try:
    from twisted.internet import asyncioreactor
    asyncioreactor.install()
except Exception:
    pass

Should I add this change to the PR?

@wRAR
Copy link
Contributor

wRAR commented Oct 1, 2025

Yes please.

@wRAR
Copy link
Contributor

wRAR commented Oct 1, 2025

Though I don't think touching the TWISTED_REACTOR envvar makes sense, Scrapy should only operate on settings (I doubt it even reads this envvar, though I won't be surprised that it does via some deprecated mechanism).

@codecov
Copy link

codecov bot commented Oct 1, 2025

Codecov Report

❌ Patch coverage is 76.47059% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.45%. Comparing base (63b3991) to head (784bf7e).

Files with missing lines Patch % Lines
scrapy_spider_metadata/_params.py 70.37% 5 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #25      +/-   ##
==========================================
- Coverage   85.71%   83.45%   -2.27%     
==========================================
  Files           4        5       +1     
  Lines         105      139      +34     
  Branches       19       25       +6     
==========================================
+ Hits           90      116      +26     
- Misses         11       16       +5     
- Partials        4        7       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants