Skip to content

Add timeouts to remote downloads#418

Open
massy-o wants to merge 2 commits into
szad670401:masterfrom
massy-o:codex/add-download-timeouts
Open

Add timeouts to remote downloads#418
massy-o wants to merge 2 commits into
szad670401:masterfrom
massy-o:codex/add-download-timeouts

Conversation

@massy-o

@massy-o massy-o commented May 14, 2026

Copy link
Copy Markdown

Summary

  • add explicit timeouts to remote model download requests
  • fail fast on HTTP errors before streaming response bodies
  • add a timeout when loading sample images from URLs

Why

The download helpers currently wait indefinitely if a remote server stalls. Adding bounded timeouts keeps CLI/model initialization paths from hanging on network sockets, while raise_for_status makes failed downloads explicit before writing partial content.

Testing

  • python3 -m py_compile Prj-Python/hyperlpr3/config/configuration.py Prj-Python/hyperlpr3/command/sample.py
  • git diff --check


def down_model_file(url, save_path):
resp = requests.get(url, stream=True)
resp = requests.get(url, stream=True, timeout=REQUEST_TIMEOUT_SECONDS)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self-review: The shared timeout constant keeps model file and zip downloads consistent, and raise_for_status prevents streaming an error page into the model cache.

@qianxiaoer

qianxiaoer commented May 14, 2026 via email

Copy link
Copy Markdown

@devloai devloai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Summary:

  • Adds timeout=30 (via a named constant) to both requests.get() calls in the model download helpers.
  • Adds raise_for_status() after each request to fail fast on HTTP errors before streaming begins.
  • Adds timeout=10 to urllib.request.urlopen() in the sample image loader.

Review Summary:

The changes are a solid step forward — bounded timeouts and explicit HTTP error checking are the right call for any network I/O path. Two issues worth addressing: (1) if a mid-stream ReadTimeout fires during down_model_file or down_model_zip, the already-opened output file is left as corrupt partial content on disk, which could cause silent failures on the next run (file exists but is invalid); a try/except that removes the partial file on failure is needed. (2) In url_to_image, the bare except Exception swallows the timeout silently and returns None with no diagnostic — now that timeouts are a real code path, at least a logger.warning is warranted, and urllib.error must be imported explicitly (not covered by import urllib alone).

Suggestions

  • Add cleanup of partial files on download failure by wrapping the streaming write in a try/except that deletes the file before re-raising. Apply
  • Replace import urllib with explicit import urllib.request and import urllib.error to make submodule dependencies clear and safe. Apply


def down_model_file(url, save_path):
resp = requests.get(url, stream=True)
resp = requests.get(url, stream=True, timeout=REQUEST_TIMEOUT_SECONDS)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timeout parameter in requests.get() sets both the connect timeout and the read timeout between chunks, not a total download time limit. For large model files over slow connections, 30 seconds between chunks is probably fine, but it's worth documenting this distinction — callers might expect a wall-clock cap.

More importantly, if the server accepts the connection and starts sending but stalls mid-transfer (e.g. after the first few chunks), the 30s read timeout will eventually fire and raise requests.exceptions.ReadTimeout, but the partial file at save_path will already be written and left on disk in a corrupt state. Consider wrapping the download in a try/except that cleans up the partial file on failure:

def down_model_file(url, save_path):
    resp = requests.get(url, stream=True, timeout=REQUEST_TIMEOUT_SECONDS)
    resp.raise_for_status()
    total = int(resp.headers.get('content-length', 0))
    try:
        with open(save_path, 'wb') as file, tqdm(...) as bar:
            for data in resp.iter_content(chunk_size=1024):
                size = file.write(data)
                bar.update(size)
    except Exception:
        if os.path.exists(save_path):
            os.remove(save_path)
        raise

Same applies to down_model_zip.

actions

Feedback: Rate this comment to help me improve future code reviews:

  • 👍 Good - Helpful and accurate
  • 👎 Poor - Wrong, unclear, or unhelpful
  • Skip if you don't have any strong opinions either way.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, accepted in 8323c2b by removing partially written model files if streaming fails before re-raising the download error.

resp = urllib.request.urlopen(url, timeout=10)
image = np.asarray(bytearray(resp.read()), dtype="uint8")
image = cv2.imdecode(image, cv2.IMREAD_COLOR)
except Exception as err:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The broad except Exception silently swallows the new urllib.error.URLError (raised on timeout) and returns None, giving the caller no indication of why it failed. Now that a timeout is a real code path, at least log a warning:

import urllib.error  # add at top of file alongside existing `import urllib`

def url_to_image(url):
    try:
        resp = urllib.request.urlopen(url, timeout=10)
        image = np.asarray(bytearray(resp.read()), dtype="uint8")
        image = cv2.imdecode(image, cv2.IMREAD_COLOR)
    except urllib.error.URLError as err:
        logger.warning(f"Failed to fetch image from URL ({err}): {url}")
        return None
    except Exception as err:
        logger.warning(f"Unexpected error reading image from URL: {err}")
        return None
    return image

Note: import urllib alone does not make urllib.error available — it must be imported explicitly.

actions

Feedback: Rate this comment to help me improve future code reviews:

  • 👍 Good - Helpful and accurate
  • 👎 Poor - Wrong, unclear, or unhelpful
  • Skip if you don't have any strong opinions either way.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, accepted in 8323c2b by importing urllib.request/urllib.error explicitly and logging URL fetch failures before returning None.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants