Open
Description
This has been happening since May 29. If we fix it ASAP, we won't have to write a backscraper to fill any possible gaps.
The scraper loads the source page, but not the individual opinion pages. It works fine on a standalone, local Juriscraper, but it's blocked on the server...
python sample_caller.py -c juriscraper.opinions.united_states.state.okla --verbosity 3 -b
Checking from a server shell confirms that the server IP is specifically blocked:
In [1]: import requests
In [2]: r = requests.get("https://www.oscn.net/applications/oscn/deliverdocument.asp?citeid=548589")
Out[2]: <Response [403]>
In [3]: r.text
# below...
Sentry Issue: COURTLISTENER-9YJ
HTTPError: 403 Client Error: Forbidden for url: https://www.oscn.net/applications/oscn/deliverdocument.asp?citeid=548272
(1 additional frame(s) were not displayed)
...
File "cl/scrapers/management/commands/cl_scrape_opinions.py", line 400, in handle
self.parse_and_scrape_site(mod, options)
File "cl/scrapers/management/commands/cl_scrape_opinions.py", line 364, in parse_and_scrape_site
self.scrape_court(site, options["full_crawl"])
File "cl/scrapers/management/commands/cl_scrape_opinions.py", line 261, in scrape_court
self.ingest_a_case(
File "cl/scrapers/management/commands/cl_scrape_opinions.py", line 298, in ingest_a_case
content = get_binary_content(item["download_urls"], site)
File "cl/scrapers/utils.py", line 304, in get_binary_content
r.raise_for_status()
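To compare the server and a local machine more easily, the manual check above could be scripted roughly as follows. This is only a sketch: `fetch_status`, `check_urls` and `blocked_urls` are hypothetical helper names, not part of Juriscraper or CourtListener, and it uses the standard library's `urllib` instead of `requests`, so the default User-Agent differs from the one in the shell session above.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, List, Optional


def fetch_status(
    url: str, headers: Optional[Dict[str, str]] = None, timeout: float = 30.0
) -> int:
    """Return the HTTP status code of a GET request to `url` (hypothetical helper)."""
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # urllib raises on 4xx/5xx responses; the status code is what we want.
        return exc.code


def check_urls(
    urls: Iterable[str], fetch: Callable[[str], int] = fetch_status
) -> Dict[str, int]:
    """Map each URL to the status code returned by `fetch`."""
    return {url: fetch(url) for url in urls}


def blocked_urls(statuses: Dict[str, int]) -> List[str]:
    """Return the URLs that answered 403 Forbidden, in input order."""
    return [url for url, status in statuses.items() if status == 403]
```

Running `blocked_urls(check_urls([...]))` with the two `deliverdocument.asp` URLs from this issue, once on the server and once locally, should show whether the 403 depends on where the request comes from. The `fetch` argument is there so the check can be exercised without hitting oscn.net.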