
fix: rewrite east ayrshire council scraper for recollect api #2081

Open
InertiaUK wants to merge 1 commit into robbrad:master from InertiaUK:fix/east-ayrshire-recollect-rewrite

Conversation

InertiaUK (Contributor) commented May 17, 2026

Summary

East Ayrshire Council migrated their bin collection lookup from a direct UPRN-based page to the ReCollect platform. The old scraper returned empty results because the council portal now requires session cookies to set an address.

This rewrites the scraper to hit the ReCollect address-suggest and events APIs directly:

  • Accepts postcode + house number (or a full address string)
  • Tries multiple resolution strategies: direct paon search, postcode search with qualifier disambiguation, combined query fallback
  • Strips trailing postcode from the house number field (handles resolve-v2 style full-address inputs)
  • No Selenium required - pure requests
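
As a rough illustration of the input handling listed above, the sketch below strips a trailing postcode from the house number field and builds the search strings in the order they would be tried. It is not the code from this PR: the helper names `strip_trailing_postcode` and `candidate_queries` and the loose postcode pattern are assumptions made for the example.

```python
import re

# Loose UK postcode pattern anchored at the end of the string.
# Illustrative only; it is not a full postcode validator.
_TRAILING_POSTCODE = re.compile(
    r"[,\s]+[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}\s*$", re.IGNORECASE
)


def strip_trailing_postcode(paon: str) -> str:
    """Remove a postcode from the end of a house number / address field."""
    return _TRAILING_POSTCODE.sub("", paon.strip()).strip(" ,")


def candidate_queries(paon: str, postcode: str) -> list:
    """Return search strings in the order they would be tried (hypothetical)."""
    paon = strip_trailing_postcode(paon)
    postcode = postcode.strip().upper()
    # 1. direct paon search, 2. postcode search, 3. combined query fallback
    candidates = [paon, postcode, f"{paon} {postcode}".strip()]
    ordered = []
    for query in candidates:
        # Drop empty strings and duplicates while keeping the order.
        if query and query not in ordered:
            ordered.append(query)
    return ordered
```

With a full-address input such as `"57 Whatriggs Road, KA1 3RB"` in the house number field, the first query becomes the bare address and the postcode is only reintroduced in the later queries.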

Testing

  • Postcode + full address: KA1 3RB + '57 Whatriggs Road' - returns bins
  • Postcode only with fallback to first result: works when address resolves to parcels
  • Tested via API end-to-end on production wrapper

Summary by CodeRabbit

Release Notes

  • Improvements
    • Updated East Ayrshire Council bin collection lookup to accept postcode with house number or full address instead of UPRN
    • Enhanced collection date retrieval and address resolution for more reliable results


The council moved from a UPRN-based lookup on their own website to the
ReCollect platform. The old scraper returned empty results because the
council portal now requires a session cookie.

This rewrites the scraper to use the ReCollect address-suggest and
events APIs directly. Accepts postcode + house number (or full address
string). Falls back through multiple resolution strategies when the
first attempt returns no match.
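
The fallback behaviour described in this commit message can be sketched as a loop over candidate queries. This is a minimal illustration under stated assumptions, not the PR's implementation: `resolve_place_id` is a hypothetical name, the lookup is passed in as a callable so no network access is needed, and the `place_id` / `name` keys are an assumed shape for the address-suggest results.

```python
from typing import Callable, Iterable, List, Optional

# A lookup function: takes a query string, returns a list of result dicts.
# In a real scraper this would wrap the address-suggest HTTP call.
Suggest = Callable[[str], List[dict]]


def resolve_place_id(
    queries: Iterable[str],
    suggest: Suggest,
    house_number: Optional[str] = None,
) -> str:
    """Try each query in turn and return the first usable place_id."""
    for query in queries:
        # Keep only results that carry a place_id (assumed key name).
        results = [r for r in suggest(query) if r.get("place_id")]
        if not results:
            continue  # nothing usable, fall through to the next strategy
        if house_number:
            prefix = house_number.lower() + " "
            for result in results:
                # Disambiguate by house number at the start of the name.
                if str(result.get("name", "")).lower().startswith(prefix):
                    return result["place_id"]
        # No house number match: fall back to the first result.
        return results[0]["place_id"]
    raise ValueError("No matching address found")
```

Passing the lookup in as a parameter keeps the resolution logic testable with a stub in place of the live API.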

Test address updated to KA1 3RB which resolves correctly on ReCollect.
coderabbitai Bot (Contributor) commented May 17, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1e1841d2-44e7-40ea-b654-d008291740ec

📥 Commits

Reviewing files that changed from the base of the PR and between 8ecf878 and 733f77d.

📒 Files selected for processing (2)
  • uk_bin_collection/tests/input.json
  • uk_bin_collection/uk_bin_collection/councils/EastAyrshireCouncil.py

📝 Walkthrough


The EastAyrshireCouncil module is migrated from BeautifulSoup-based HTML parsing to ReCollect API integration. The implementation resolves addresses via the address-suggest endpoint, retrieves bin collection events, and transforms them into a standardized output. Test configuration is updated to reflect the new API-driven approach.

Changes

ReCollect API Migration

| Layer / File(s) | Summary |
|---|---|
| API setup and constants: uk_bin_collection/uk_bin_collection/councils/EastAyrshireCouncil.py | BeautifulSoup import removed; new module-level constants (HEADERS, AREA, SERVICE) introduced for ReCollect API parameterization. |
| Address resolution via ReCollect address-suggest: uk_bin_collection/uk_bin_collection/councils/EastAyrshireCouncil.py | parse_data now extracts address inputs from paon/number and postcode, performs multiple address-suggest API queries with fallback strategies, and resolves a place_id or raises ValueError if no match is found. |
| Events retrieval and bin transformation: uk_bin_collection/uk_bin_collection/councils/EastAyrshireCouncil.py | Events endpoint queries for ~60 days using date bounds, filters pickup-related events, transforms to bin objects, deduplicates, sorts by collection date, and returns {"bins": [...]} structure. |
| Test configuration update: uk_bin_collection/tests/input.json | EastAyrshireCouncil test entry now includes house_number and postcode fields, sets skip_get_url to true, and updates guidance to reference the new ReCollect API workflow. |
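
To make the events step in the walkthrough concrete, here is a hedged sketch of the date window and the filter, deduplicate, and sort pass. It is not taken from the PR. The event shape (`day`, `flags`, `event_type`, `subject`), the `after`/`before` parameter names, the `type`/`collectionDate` output keys, and the `dd/mm/YYYY` output format are all assumptions, and `date_window` and `events_to_bins` are hypothetical helper names.

```python
from datetime import date, datetime, timedelta

DATE_FORMAT = "%d/%m/%Y"  # assumed output date format


def date_window(today: date, days: int = 60) -> dict:
    """Build ISO date bounds covering roughly the next `days` days."""
    return {
        "after": today.isoformat(),
        "before": (today + timedelta(days=days)).isoformat(),
    }


def events_to_bins(events: list) -> dict:
    """Filter pickup events, deduplicate, and sort by collection date."""
    unique = set()
    for event in events:
        day = event.get("day")
        if not day:
            continue
        for flag in event.get("flags", []):
            if flag.get("event_type") != "pickup":
                continue  # skip holidays, reminders and other event types
            bin_type = flag.get("subject") or flag.get("name")
            if bin_type:
                # A set of (date, type) tuples removes duplicates.
                parsed = datetime.strptime(day, "%Y-%m-%d").date()
                unique.add((parsed, bin_type))
    bins = [
        {"type": bin_type, "collectionDate": day.strftime(DATE_FORMAT)}
        for day, bin_type in sorted(unique)  # sorts by date, then by type
    ]
    return {"bins": bins}
```

Sorting on parsed `date` objects rather than the formatted strings keeps the order correct across month boundaries.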

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

  • #2005: This PR directly implements the ReCollect API address-suggest and events endpoint logic suggested in the issue report for the East Ayrshire Council integration.

Suggested reviewers

  • dp247

Poem

🐰 From HTML's tangled roots we hop away,
To Recollect's clean API at play,
Addresses resolve in queries swift,
Bin dates collected—a thoughtful shift!
East Ayrshire's schedule now flows so bright,
With address and events aligned just right. 🚮✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix: rewrite east ayrshire council scraper for recollect api' directly and clearly describes the main change: rewriting the East Ayrshire Council scraper to use the ReCollect API instead of the previous HTML scraping approach. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


codecov Bot commented May 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.67%. Comparing base (8ecf878) to head (733f77d).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2081   +/-   ##
=======================================
  Coverage   86.67%   86.67%           
=======================================
  Files           9        9           
  Lines        1141     1141           
=======================================
  Hits          989      989           
  Misses        152      152           


