fix: update NCAAB boxscore CSS selectors broken by sports-reference HTML changes #817
Open
seang1121 wants to merge 1 commit into roclark:master from
…TML changes

sports-reference.com updated their page structure, breaking multiple NCAAB boxscore fields that returned None or empty results.

Changes to BOXSCORE_SCHEME in sportsipy/ncaab/constants.py:
- away_name/home_name: replaced deprecated a[itemprop="name"] with div#sb_team_0/1 strong a (sports-reference dropped itemprop attributes)
- away_record/home_record: replaced div#boxes div[class="section_heading"] h2 with div#boxes h2 (section_heading wrapper no longer contains h2 elements)
- away_ranking/home_ranking: replaced exact class attribute match with CSS class selector (div.game_summary.nohover.current tr) for robustness

Verified against live 2024 NCAA Tournament game (Wagner vs North Carolina, March 21 2024); all fields now return correct values.

Fixes roclark#774 - NCAAB Boxscore returning blank for valid inputs
Fixes roclark#757 - winning_abbr, winning_name, losing_abbr, losing_name not populating

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
sports-reference.com updated their page HTML structure, breaking several NCAAB boxscore fields that silently returned `None` or empty results.

Fields fixed:
- `away_name`/`home_name`: `a[itemprop="name"]` no longer exists on boxscore pages. Replaced with `div#sb_team_0 strong a` / `div#sb_team_1 strong a`, which reflect the current `scorebox_team` div structure.
- `away_record`/`home_record`: `div[class="section_heading"] h2` no longer contains text. Replaced with `div#boxes h2` (the existing empty-string filter in `_parse_record` handles blank entries correctly).
- `away_ranking`/`home_ranking`: changed from a fragile exact class attribute match (`div[class="game_summary nohover current"]`) to a CSS class selector (`div.game_summary.nohover.current`), which stays robust if the order of class names in the attribute changes.

Fixes
- roclark#774: NCAAB Boxscore returning blank for valid inputs
- roclark#757: `winning_abbr`, `winning_name`, `losing_abbr`, `losing_name` not populating
Verified against
Live 2024 NCAA Tournament game: Wagner (62) vs North Carolina (90), March 21 2024.
All name, record, score, and ranking fields return correct values after this change.
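
For reviewers, here is a minimal sketch of why the ranking selector change matters. It does not use sportsipy or its parser; it only uses Python's standard `html.parser` to imitate the two matching rules. The HTML snippets, the `DivClassCollector` class, and the `count_matches` helper are hypothetical and are not taken from sports-reference.com or from this PR's diff.

```python
from html.parser import HTMLParser

# Exact attribute value required by div[class="game_summary nohover current"]
EXACT = "game_summary nohover current"
# Class names required by div.game_summary.nohover.current (order-independent)
REQUIRED = {"game_summary", "nohover", "current"}


class DivClassCollector(HTMLParser):
    """Collect the raw class attribute of every <div> in a document."""

    def __init__(self):
        super().__init__()
        self.div_classes = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            # A missing or valueless class attribute is treated as empty.
            self.div_classes.append(dict(attrs).get("class") or "")


def count_matches(html):
    """Return (exact-attribute matches, class-selector matches) for <div>s."""
    parser = DivClassCollector()
    parser.feed(html)
    exact = sum(1 for c in parser.div_classes if c == EXACT)
    by_class = sum(1 for c in parser.div_classes if REQUIRED <= set(c.split()))
    return exact, by_class


# Hypothetical markup: same classes, different order in the attribute.
original = '<div class="game_summary nohover current"></div>'
reordered = '<div class="game_summary current nohover"></div>'

print(count_matches(original))   # (1, 1)
print(count_matches(reordered))  # (0, 1)
```

The exact attribute match fails as soon as the class names are reordered, while the class-based rule still matches. The same reasoning applies if the site adds an extra class to the element.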
🤖 Generated with Claude Code