feat: add postcode lookup for malvern hills district council #2074

Open
InertiaUK wants to merge 2 commits into robbrad:master from InertiaUK:feat/malvern-hills-postcode-lookup

Conversation

Contributor

@InertiaUK InertiaUK commented May 13, 2026

Summary

  • Users can provide either UPRN or postcode + house number
  • UPRN takes priority when provided (backward compatible)
  • Uses the council's own address lookup API (sw2AddressLookupWS) for postcode-to-UPRN resolution
  • Matches by house number with fallback to single-result auto-select
  • Adds a null check on the results table for a clear error when the address is not in bin round records
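The input precedence described in the bullets above can be sketched roughly as follows. This is an illustration only, not the PR's code: `choose_uprn` and the injected `resolve` callable are hypothetical names, and the real scraper reads its inputs from `kwargs`.

```python
from typing import Callable, Optional


def choose_uprn(
    uprn: Optional[str],
    postcode: Optional[str],
    paon: Optional[str],
    resolve: Callable[[str, Optional[str]], str],
) -> str:
    """Return the UPRN to query, preferring an explicitly supplied one.

    `resolve` is a hypothetical stand-in for the postcode lookup step.
    """
    # An explicit UPRN always wins, so existing configurations keep working.
    if uprn:
        return uprn
    # Otherwise fall back to the postcode (plus optional house number) lookup.
    if postcode:
        return resolve(postcode, paon)
    raise ValueError("Provide either a UPRN or a postcode")
```

Passing the lookup in as a callable keeps the precedence rule testable without touching the network.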

The existing scraper wasn't broken; it works fine with a valid UPRN. This change removes the need for users to find and supply a UPRN manually, a common pain point (see #1497).

Testing

  • UPRN path (backward compat): WR14 1AA + UPRN 100120606212
  • Postcode + house number: WR14 1AA + paon 148
  • Tested via API end-to-end

Summary by CodeRabbit

  • New Features

    • MalvernHillsDC scraper now supports automatic UPRN resolution via address lookup. Users can provide postcode and house number, and the scraper will automatically resolve the matching UPRN.
  • Bug Fixes

    • Improved error handling with explicit validation when bin collection data is not available.

The scraper now resolves UPRNs from postcode + house number using
the council's own address lookup API, removing the need for users
to find and supply a UPRN manually. UPRN-only input still works.

Also adds a null check on the results table to give a clear error
when the address is not found in bin round records.
Contributor

coderabbitai Bot commented May 13, 2026

Warning

Review limit reached

@InertiaUK, we couldn't start this review because you've used your available PR reviews for now.

Your plan currently allows 2 reviews/hour. Refill in 20 minutes and 55 seconds.

Your organization has run out of usage credits. Purchase more in the billing tab.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 12d5afb2-24a5-47d1-8576-f31e1bc0e539

📥 Commits

Reviewing files that changed from the base of the PR and between a2f6624 and 271418f.

📒 Files selected for processing (1)
  • uk_bin_collection/uk_bin_collection/councils/MalvernHillsDC.py
📝 Walkthrough


MalvernHillsDC now supports resolving UPRN via postcode and house number through a council address lookup API. The scraper accepts postcode/house_number to derive UPRN or accepts UPRN directly. A new _resolve_uprn method queries the lookup endpoint, handles ambiguity via address matching, and integrates conditionally into the main parsing flow.

Changes

Malvern Hills UPRN Lookup Feature

| Layer / File(s) | Summary |
| --- | --- |
| Test configuration for postcode and house number inputs (uk_bin_collection/tests/input.json) | Test input configuration updated to include postcode and house_number fields alongside uprn, and wiki_note revised to document that UPRN can be derived via the council's address lookup API or provided directly. |
| Address lookup and UPRN resolution method (uk_bin_collection/uk_bin_collection/councils/MalvernHillsDC.py) | New _resolve_uprn method performs HTTP GET to UPRN_LOOKUP_URL, handles empty results with ValueError, optionally disambiguates candidates using case-insensitive house_number prefix match against Address_Short field, and returns single UPRN when exactly one match is found or raises ValueError on multiple matches. |
| Parse data integration with UPRN resolution and error handling (uk_bin_collection/uk_bin_collection/councils/MalvernHillsDC.py) | parse_data now declares UPRN_LOOKUP_URL constant, conditionally calls _resolve_uprn when uprn is absent but postcode is provided, validates resolved UPRN with check_uprn, and raises ValueError when results table element is not found. Inline comment removed from thisCollection assignment. |
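The candidate selection summarised for `_resolve_uprn` might look something like the sketch below. It is an assumption-laden illustration, not the code under review: `pick_uprn` is a hypothetical name, the candidate list is assumed to be already decoded into dictionaries, and only the `Address_Short` and `UPRN` keys come from the walkthrough.

```python
from typing import Dict, List, Optional


def pick_uprn(
    results: List[Dict[str, str]],
    postcode: str,
    house_number: Optional[str] = None,
) -> str:
    """Pick one UPRN from lookup candidates, or raise ValueError."""
    if not results:
        raise ValueError(f"No addresses found for {postcode}")
    if house_number:
        wanted = house_number.strip().lower()
        for entry in results:
            # Case-insensitive prefix match, as the walkthrough describes it.
            if entry.get("Address_Short", "").lower().startswith(wanted):
                return entry["UPRN"]
    # Without a house-number match, only a single candidate is unambiguous.
    if len(results) == 1:
        return results[0]["UPRN"]
    raise ValueError(f"Multiple addresses found for {postcode}")
```

Note that a plain prefix match means a house number of "1" would also match an address starting with "12", which is the weakness the inline review comment raises.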

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A postcode and a house number fine,
Through lookup tables they align,
The UPRN appears with care,
When addresses match just there,
Now bins shall find their destined fare! ✨🗑️

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the primary change: adding postcode lookup functionality to the Malvern Hills District Council scraper, enabling users to provide postcode and house number instead of manually looking up a UPRN. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
uk_bin_collection/uk_bin_collection/councils/MalvernHillsDC.py (2)

88-90: ⚡ Quick win

Align disambiguation error wording with the actual CLI argument.

The user-facing argument is paon (see kwargs.get("paon") on line 20 and the input.json change in this PR), but the error tells them to provide house_number, which they have no way to pass. Mention paon (or both) so the message is actionable.

✏️ Proposed wording tweak
-        raise ValueError(
-            f"Multiple addresses found for {postcode} — provide house_number to disambiguate"
-        )
+        raise ValueError(
+            f"Multiple addresses found for {postcode} — provide -p/--paon (house number) to disambiguate"
+        )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@uk_bin_collection/uk_bin_collection/councils/MalvernHillsDC.py` around lines
88 - 90, The disambiguation error message in MalvernHillsDC.py currently tells
users to provide "house_number" which doesn't match the actual CLI/input
argument "paon" (see kwargs.get("paon")); update the ValueError message raised
where multiple addresses are found to reference "paon" (or both "paon" and
"house_number") so the instruction is actionable for users, e.g., mention
"provide paon (house number)" or "provide paon/house_number" in the exception
raised.

63-72: ⚡ Quick win

Tighten the signature and guard the outbound HTTP call.

Two small follow-ups on _resolve_uprn:

  • Line 63 trips Ruff RUF013 (PEP 484 forbids implicit Optional). Annotate house_number as Optional[str] (or str | None) and default to None.
  • Line 71 has no timeout= on requests.get, so a hung lookup server can block the scraper indefinitely. Adding a sensible timeout will fail fast and surface as a requests exception that the caller already understands.
♻️ Proposed diff
-from bs4 import BeautifulSoup
+from typing import Optional
+
+from bs4 import BeautifulSoup
@@
-    def _resolve_uprn(self, postcode: str, house_number: str = None) -> str:
+    def _resolve_uprn(self, postcode: str, house_number: Optional[str] = None) -> str:
         params = {
             "simple": "T",
             "pcode": postcode,
             "authority": "MHDC",
             "historical": "false",
             "hidedummyuprn": "1",
         }
-        response = requests.get(self.UPRN_LOOKUP_URL, params=params, verify=False)
+        response = requests.get(
+            self.UPRN_LOOKUP_URL, params=params, verify=False, timeout=30
+        )
         response.raise_for_status()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@uk_bin_collection/uk_bin_collection/councils/MalvernHillsDC.py` around lines
63 - 72, The _resolve_uprn function signature and HTTP call need tightening:
annotate house_number as Optional[str] (or use str | None) in the _resolve_uprn
signature to satisfy Ruff RUF013 and update any needed typing imports, and add a
sensible timeout parameter to the requests.get call (e.g., timeout=10) when
calling self.UPRN_LOOKUP_URL so the outbound lookup fails fast and raises a
requests exception instead of hanging.
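A rough sketch of the timeout suggestion above, written so it can be exercised without network access. The `fetch_candidates` name and the injected `get` callable are hypothetical; in the scraper the callable would be `requests.get`. The sketch also assumes the endpoint returns a JSON list, which the review does not state explicitly, and it leaves out the `verify=False` argument shown in the diff.

```python
from typing import Any, Callable, Dict, List


def fetch_candidates(
    get: Callable[..., Any],
    url: str,
    postcode: str,
    timeout: float = 30.0,
) -> List[Dict[str, Any]]:
    """Query the address lookup with a bounded wait.

    `get` is expected to behave like requests.get (hypothetical injection
    point, used here so the sketch can be tested with a fake).
    """
    params = {
        "simple": "T",
        "pcode": postcode,
        "authority": "MHDC",
        "historical": "false",
        "hidedummyuprn": "1",
    }
    # With requests, a single number bounds both the connect and read waits,
    # so a hung server raises a Timeout instead of blocking indefinitely.
    response = get(url, params=params, timeout=timeout)
    response.raise_for_status()
    # Assumption: the endpoint answers with a JSON array of address records.
    return response.json()
```

The default of 30 seconds mirrors the proposed diff; the agent prompt suggests 10, and either value is a judgment call.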
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@uk_bin_collection/uk_bin_collection/councils/MalvernHillsDC.py`:
- Around line 78-83: The current logic in MalvernHillsDC.py uses
addr.startswith(house_lower) which can incorrectly match prefixes (e.g., "1"
matching "148 …"); change the comparison to check the first whitespace-delimited
token of entry["Address_Short"] against house_lower (or use a word-boundary
regex like rf"^{re.escape(house_lower)}\b") so only an exact house-number token
(not a prefix) returns entry["UPRN"]; keep using the same variables
(house_number/house_lower, results, entry.get("Address_Short"), entry["UPRN"])
and ensure you lower/strip the token before comparing.

---

Nitpick comments:
In `@uk_bin_collection/uk_bin_collection/councils/MalvernHillsDC.py`:
- Around line 88-90: The disambiguation error message in MalvernHillsDC.py
currently tells users to provide "house_number" which doesn't match the actual
CLI/input argument "paon" (see kwargs.get("paon")); update the ValueError
message raised where multiple addresses are found to reference "paon" (or both
"paon" and "house_number") so the instruction is actionable for users, e.g.,
mention "provide paon (house number)" or "provide paon/house_number" in the
exception raised.
- Around line 63-72: The _resolve_uprn function signature and HTTP call need
tightening: annotate house_number as Optional[str] (or use str | None) in the
_resolve_uprn signature to satisfy Ruff RUF013 and update any needed typing
imports, and add a sensible timeout parameter to the requests.get call (e.g.,
timeout=10) when calling self.UPRN_LOOKUP_URL so the outbound lookup fails fast
and raises a requests exception instead of hanging.
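The inline comment about `addr.startswith(house_lower)` can be illustrated with a small token-based matcher. This is a sketch under assumptions, not the fix that was merged: `house_number_matches` is a hypothetical helper, and stripping a trailing comma from the first token is an extra guess about how `Address_Short` might be formatted.

```python
def house_number_matches(address_short: str, house_number: str) -> bool:
    """True when the first whitespace-delimited token equals the house number."""
    tokens = address_short.strip().lower().split()
    if not tokens:
        return False
    # Assumption: tolerate "148, Example Road" by dropping a trailing comma.
    first_token = tokens[0].rstrip(",")
    return first_token == house_number.strip().lower()


# Prefix matching accepts "1" for an address at number 148 ...
prefix_hit = "148 example road".startswith("1")
# ... while comparing whole tokens does not.
token_hit = house_number_matches("148 Example Road", "1")
```

Comparing whole tokens also keeps "148" from matching "148A", which may or may not be desirable depending on how flats and sub-buildings appear in the council's data.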

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a6854851-1abe-4c42-aa01-9f626109621c

📥 Commits

Reviewing files that changed from the base of the PR and between 8ecf878 and a2f6624.

📒 Files selected for processing (2)
  • uk_bin_collection/tests/input.json
  • uk_bin_collection/uk_bin_collection/councils/MalvernHillsDC.py

Comment thread uk_bin_collection/uk_bin_collection/councils/MalvernHillsDC.py

codecov Bot commented May 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.67%. Comparing base (8ecf878) to head (271418f).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2074   +/-   ##
=======================================
  Coverage   86.67%   86.67%           
=======================================
  Files           9        9           
  Lines        1141     1141           
=======================================
  Hits          989      989           
  Misses        152      152           

