Tus implementation to resume transfer #540

Merged
ypriverol merged 26 commits into main from dev
Jan 16, 2026

@ypriverol (Member) commented Jan 15, 2026

Pull Request

Description

Brief description of the changes made in this PR.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Test addition/update
  • Dependencies have been updated.

Summary by CodeRabbit

  • New Features
    • Resumable TUS uploads with pause/resume, persistent resume, server job lookup by filename, and FragPipe input support (plus UI example).
  • Bug Fixes
    • Improved upload error handling (resumable vs permanent), more reliable status polling, and enhanced progress/ETA reporting.
  • Chores
    • CI now uses dynamic Docker tagging and includes OCI image metadata.
  • UI
    • PRIDE Dataset button now visible.

selvaebi and others added 20 commits January 14, 2026 22:32
Add FragPipe as an additional supported dataset format that can be loaded
from PRIDE and processed by the pmultiqc service.

Changes:
- Add FragPipe file detection (psm.tsv, ion.tsv) in detect_input_type()
- Add FragPipe files to filter_search_files() for PRIDE dataset filtering
- Add 'fragpipe' to allowed_input_types and --fragpipe-plugin handling
- Add PXD066146 as a FragPipe example in the PRIDE examples modal
…bpwS1

Refactor pmultiqc service to support FragPipe
Tus - transfer improvement using Tus framework
@ypriverol ypriverol requested a review from selvaebi January 15, 2026 18:49
@coderabbitai (Contributor, bot) commented Jan 15, 2026

📝 Walkthrough

Adds TUS resumable uploads (server router, on_upload_complete, client tus-js integration), background archive extraction and FragPipe detection, Redis filename→job_id mapping and polling, MultiQC fragpipe wiring, dynamic Docker tagging/OCI labels in CI, a FastAPI dependency bump, and config/UI tweaks.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CI: Docker Build & Push (/.github/workflows/build-and-push.yml) | Add branch extraction, set DOCKER_TAG (latest for main/master, else branch), use ghcr.io/bigbio/pmultiqc:${{ env.DOCKER_TAG }}, and add OCI labels (org.opencontainers.image.source, org.opencontainers.image.revision, org.opencontainers.image.version). |
| Backend: TUS, Jobs & MultiQC (pmultiqc_service/app.py) | Mount TUS router at /files, add handle_upload_complete, process_upload_in_background, GET /tus-job/{filename}, Redis filename→job_id mapping, background extraction/processing, FragPipe detection (detect_input_type/filter_search_files), and fragpipe support in run_pmultiqc_with_progress; adjust CORS, middleware, and static/template paths. |
| Frontend: Resumable Upload UI (pmultiqc_service/templates/index.html) | Replace XHR upload with tus-js-client flow (resume, retries, progress/ETA), add Resume/Cancel controls and online-resume, fetch job_id by filename and polling, improved error handling and cleanup, and add FragPipe example UI. |
| Dependencies (pmultiqc_service/requirements.txt) | Bump fastapi constraint and add tuspyserver>=4.2.0. |
| Kubernetes Config (pmultiqc_service/k8s/configmap.yaml) | Set PRIDE_BUTTON_VISIBLE: "true"; minor whitespace/newline edits. |

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant Frontend as Frontend (index.html)
    participant TUS as TUS Router
    participant Backend as FastAPI
    participant Redis
    participant Worker as Background Worker
    participant MultiQC

    Client->>Frontend: select file & start upload
    Frontend->>TUS: POST /files (initiate TUS upload)
    TUS-->>Frontend: Location (upload URL)

    loop upload chunks
        Frontend->>TUS: PATCH upload (resume-capable chunks)
        TUS-->>Frontend: Upload-Offset
    end

    TUS->>Backend: on_upload_complete callback
    Backend->>Redis: set filename → job_id (with TTL)
    Backend->>Worker: enqueue background processing

    Worker->>Worker: extract archive & detect input type
    Worker->>Redis: update job progress/status
    Worker->>MultiQC: run pmultiqc with plugins/args
    Worker->>Backend: persist results / mark complete

    Frontend->>Backend: GET /tus-job/{filename}
    Backend->>Redis: lookup job_id
    Backend-->>Frontend: job_id

    loop status polling
        Frontend->>Backend: GET /job-status/{job_id}
        Backend->>Redis: return updates
        Backend-->>Frontend: progress/status
    end

    Backend-->>Frontend: final status (complete/failure)
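
The filename → job_id step in the diagram can be sketched without a running Redis server. The class below is a hypothetical in-memory stand-in that mimics only the `setex`/`get` behaviour of a Redis client. The key prefix is the one quoted in the review comments; `TTL_SECONDS`, `remember_job`, and `lookup_job` are assumptions made for illustration and are not taken from app.py.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class InMemoryTTLStore:
    """Hypothetical stand-in that mimics Redis SETEX/GET semantics."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, bytes]] = {}

    def setex(self, key: str, ttl_seconds: int, value: bytes) -> None:
        # Store the value together with its absolute expiry time.
        self._data[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries behave as if they were never set.
            del self._data[key]
            return None
        return value


KEY_PREFIX = "pmultiqc:tus_job:"  # key format quoted in the review
TTL_SECONDS = 3600  # assumed value; the PR does not state the real TTL


def remember_job(store: InMemoryTTLStore, filename: str, job_id: str) -> None:
    """Record which job a completed upload belongs to."""
    store.setex(f"{KEY_PREFIX}{filename}", TTL_SECONDS, job_id.encode("utf-8"))


def lookup_job(store: InMemoryTTLStore, filename: str) -> Optional[str]:
    """Return the job_id for a filename, or None if unknown or expired."""
    raw = store.get(f"{KEY_PREFIX}{filename}")
    return raw.decode("utf-8") if raw is not None else None
```

Keying by filename keeps the client simple, but two uploads with the same filename inside the TTL window would overwrite each other's mapping in this sketch.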

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

Review effort 4/5, Possible security concern

Poem

🐰 I nibble bytes and stitch each chunk,

I hop back on when networks dunk.
FragPipe crumbs and Redis threads,
MultiQC sings while progress treads.
Hooray — resumable uploads, yum!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Tus implementation to resume transfer' directly relates to the main change: adding TUS resumable upload support throughout the codebase (backend, frontend, workflow, dependencies). |
| Docstring Coverage | ✅ Passed | Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%. |


🧹 Recent nitpick comments
pmultiqc_service/templates/index.html (2)

493-502: Consider scoping the milestone tracking variable.

window.lastLoggedPercent is a global that persists across uploads. While this only affects logging (not functionality), it could cause slightly confusing log output if a user performs multiple uploads without page refresh.

♻️ Optional: Reset in onSuccess or use local scope
                onSuccess: function() {
                    console.log('✅ TUS upload completed successfully');
+                   window.lastLoggedPercent = -1;  // Reset for next upload
                    document.getElementById('cancelUploadBtn').style.display = 'none';

574-609: Consider adding cancellation support to prevent stale callbacks.

The recursive polling doesn't have a cancellation mechanism. If a user starts a new upload before the polling completes (unlikely but possible), stale callbacks could update currentJobId incorrectly.

♻️ Suggested: Add cancellation check
+       let fetchJobIdAborted = false;  // Add at module scope (near line 218)
+
        // Fetch job_id by filename after TUS upload completes
-       function fetchJobIdByFilename(filename, attempts = 0) {
+       function fetchJobIdByFilename(filename, attempts = 0, uploadRef = currentUploadXhr) {
+           // Abort if upload was cancelled or a new upload started
+           if (fetchJobIdAborted || currentUploadXhr !== uploadRef) {
+               console.log('🛑 Job ID fetch aborted (upload changed)');
+               return;
+           }
            if (attempts > 10) {

Then set fetchJobIdAborted = true in cancelUpload() and reset it in uploadFiles().


📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6dfe8ca and f703118.

📒 Files selected for processing (1)
  • pmultiqc_service/templates/index.html
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: test_lfq
  • GitHub Check: test_tmt
  • GitHub Check: test_mzid_mzML
🔇 Additional comments (11)
pmultiqc_service/templates/index.html (11)

217-218: LGTM! State management properly addresses previous review concerns.

The isUploadPaused flag correctly replaces the invalid upload.isUploading property check flagged in past reviews. The resumeUpload function safely validates currentUploadXhr before calling start().

Also applies to: 268-287


289-303: LGTM! Proper cleanup on upload cancellation.

The function correctly clears state, removes event listeners via cleanupUploadListeners(), and hides UI elements. This addresses the memory leak concern from previous reviews.


376-396: LGTM! Well-configured TUS upload with resumability support.

The configuration includes:

  • Extensive retry delays with exponential backoff
  • Reasonable 50MB chunk size for large files
  • Custom fingerprint enabling proper resume identification
  • removeFingerprintOnSuccess: true to clean up after completion
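
As a rough illustration of the backoff idea in the first bullet, the helper below builds a capped, geometrically growing delay list. The function name and the concrete numbers are assumptions; the actual delay values in index.html are not shown in this review, and the frontend itself is JavaScript rather than Python.

```python
from typing import List


def backoff_schedule(base_ms: int, factor: int, cap_ms: int, attempts: int) -> List[int]:
    """Return retry delays in milliseconds, growing geometrically up to a cap."""
    delays: List[int] = []
    delay = base_ms
    for _ in range(attempts):
        # Never wait longer than the cap, however many retries have failed.
        delays.append(min(delay, cap_ms))
        delay *= factor
    return delays
```

With a base of 1000 ms, a factor of 2, and a 60000 ms cap, eight attempts give delays that double until they reach the cap and then stay there.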

397-439: LGTM! Error handling correctly distinguishes resumable vs permanent failures.

The network error detection logic (checking for missing response or status 0/undefined/null) is appropriate. Network errors preserve the upload object for resume capability while permanent failures properly clean up state.
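
The decision described above can be restated as a small sketch. It is written in Python only to keep the examples in one language; the function names and the returned labels are hypothetical and do not appear in index.html.

```python
from typing import Optional


def is_network_error(status: Optional[int]) -> bool:
    """Treat a missing response or a status of 0 as a transient network failure."""
    return status is None or status == 0


def next_action(status: Optional[int]) -> str:
    """Decide what to do with the upload object after an error."""
    if is_network_error(status):
        # Keep the upload object so the user (or the online handler) can resume.
        return "keep_for_resume"
    # Any real HTTP status means the server answered; treat it as permanent.
    return "cleanup"
```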


504-547: LGTM! Comprehensive cleanup in onSuccess.

The handler properly:

  • Cleans up event listeners (addressing memory leak concern)
  • Removes TUS localStorage entries to prevent stale resume attempts
  • Transitions UI to processing state with appropriate delay for server-side processing

553-572: LGTM! Online handler correctly implements auto-resume.

The handler addresses previous review feedback by:

  • Using isUploadPaused instead of the non-existent upload.isUploading
  • Storing handler reference for cleanup via cleanupUploadListeners()
  • Double-checking both upload reference equality and paused state

665-669: LGTM! Status deduplication reduces log noise.

Logging only on status transitions is a sensible improvement for long-running processing jobs.


814-832: LGTM! XSS protection and cleanup utilities properly implemented.

escapeHtml addresses the previous XSS concern by escaping HTML special characters. cleanupUploadListeners properly removes and nullifies the event handler to prevent memory leaks.


834-864: LGTM! Error display properly sanitizes user-controlled content.

The escapeHtml(message) call addresses the XSS vulnerability flagged in previous reviews. Button HTML is static and safe. The differentiated cleanup paths for resumable vs non-resumable errors maintain correct state.
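
The same principle, escaping user-controlled text while keeping static markup as-is, can be shown with Python's standard `html.escape`. This is an analogue of the JavaScript `escapeHtml` helper, not its implementation; `render_error_html` and the markup it emits are hypothetical.

```python
import html


def render_error_html(message: str) -> str:
    """Build an error fragment where only the static markup is trusted."""
    # Escape &, <, >, and quotes so the message is rendered as plain text.
    safe_message = html.escape(message, quote=True)
    # The button markup is a fixed string and needs no escaping.
    return (
        f'<div class="error">{safe_message}</div>'
        '<button id="returnBtn">Return</button>'
    )
```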


1037-1062: LGTM! FragPipe example follows existing pattern.

The new example card is consistent with the others in structure and styling, providing users with a useful FragPipe/MSFragger dataset reference.


9-10: Verify the SRI hash is correct for tus-js-client@4.3.1.

The version is correctly pinned to 4.3.1, addressing the previous review comment. However, the SRI hash should be verified to ensure it matches the actual CDN resource at https://cdn.jsdelivr.net/npm/tus-js-client@4.3.1/dist/tus.min.js. An incorrect hash would cause the script to fail to load silently, breaking the upload functionality.
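
One way to perform that verification is to recompute the integrity value from the file bytes and compare it with the `integrity` attribute. The sketch below assumes the SHA-384 variant of SRI and that the script bytes have already been downloaded; the function name is hypothetical.

```python
import base64
import hashlib


def sri_sha384(content: bytes) -> str:
    """Return a Subresource Integrity value of the form 'sha384-<base64 digest>'."""
    digest = hashlib.sha384(content).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")
```

Comparing the returned string with the attribute in the `<script>` tag confirms whether the pinned file and the hash agree; any mismatch means the browser would refuse to execute the script.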

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@pmultiqc_service/app.py`:
- Around line 499-530: Remove the redundant Redis write that uses
TUS_JOB_FILENAME_PREFIX and TUS_JOB_FILENAME_TTL_SECONDS: delete the first block
that calls get_redis_client(), constructs key =
f"{TUS_JOB_FILENAME_PREFIX}{filename}", and calls redis_client.setex(...
job_id.encode("utf-8")). Keep the later Redis mapping that uses key =
f"pmultiqc:tus_job:{filename}" (the endpoint relies on this). Also remove the
now-unused constants TUS_JOB_FILENAME_PREFIX and TUS_JOB_FILENAME_TTL_SECONDS
from the top-level constants so there are no dead variables.

In `@pmultiqc_service/requirements.txt`:
- Line 1: Update the FastAPI dependency line in requirements.txt from
"fastapi>=0.110.0,<0.111.0" to a version that allows Starlette >=0.40.0 (for
example "fastapi>=0.128.0,<0.129.0" or a newer patch range); then reinstall or
regenerate your lockfile (pip-compile, pipenv lock, or poetry lock as
applicable) and run the test suite to ensure compatibility (check for any
import/API changes after upgrading FastAPI in your codebase).

In `@pmultiqc_service/templates/index.html`:
- Around line 541-554: The online event listener (onlineHandler) is added but
never removed, causing a memory leak; assign the handler to a property on the
upload object (e.g., upload._onlineHandler = onlineHandler) right after creating
it and adding it with window.addEventListener('online', onlineHandler), and then
remove it (window.removeEventListener('online', upload._onlineHandler)) from the
upload cleanup paths such as the upload.onSuccess, upload.onError
(non-resumable), and any cancel/finish handlers so the listener is always
detached when the upload lifecycle ends.
- Around line 800-821: The showError function is inserting the message into
resultDiv.innerHTML without escaping, allowing XSS when message contains
user-controlled content; update showError to sanitize/escape HTML in the message
before building errorHtml (or use textContent for the message node) so any
filenames or user data are rendered as plain text, then append the Resume and
Return buttons via DOM methods (createElement, setAttribute, addEventListener)
or insert their HTML safely after the escaped text; ensure you still clear
currentUploadXhr and toggle progressContainer/cancelUploadBtn/resumeUploadBtn as
before.
- Around line 9-10: Replace the floating CDN reference to tus-js-client with a
pinned release to avoid unexpected upgrades: update the <script> tag that loads
"tus-js-client" so its src uses a specific version (e.g., change `@latest` to
`@4.3.1` so the URL becomes .../tus-js-client@4.3.1/dist/tus.min.js); ensure the
script tag that references the library is the only place changed and consider
adding SRI/hash later if required for additional security.
🧹 Nitpick comments (6)
.github/workflows/build-and-push.yml (2)

32-44: Dynamic tagging logic looks good, but consider branch name sanitization.

The logic correctly tags main/master as latest and uses branch names for feature branches. However, branch names with characters invalid for Docker tags (e.g., /, #) could cause build failures.

Consider sanitizing the branch name:

♻️ Optional: Sanitize branch name for Docker tag compatibility
      - name: Determine Docker tag
        id: tag
        shell: bash
        run: |
          if [[ "${{ env.BRANCH_NAME }}" == "main" ]] || [[ "${{ env.BRANCH_NAME }}" == "master" ]]; then
            echo "DOCKER_TAG=latest" >> $GITHUB_ENV
          else
-           echo "DOCKER_TAG=${{ env.BRANCH_NAME }}" >> $GITHUB_ENV
+           # Sanitize branch name for Docker tag (replace invalid chars with -)
+           SANITIZED=$(echo "${{ env.BRANCH_NAME }}" | sed 's/[^a-zA-Z0-9._-]/-/g')
+           echo "DOCKER_TAG=${SANITIZED}" >> $GITHUB_ENV
          fi
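
The same sanitization rule can be expressed in Python, which makes it easy to test outside the workflow. This sketch applies the character replacement from the `sed` expression above and additionally trims a leading period or dash and caps the length at 128 characters, since Docker tags may not start with those characters or exceed that length. The function name and the `unnamed` fallback are assumptions.

```python
import re


def sanitize_docker_tag(branch: str) -> str:
    """Map a git branch name to a string that is valid as a Docker tag."""
    # Replace every character outside the allowed set with a dash.
    tag = re.sub(r"[^A-Za-z0-9._-]", "-", branch)
    # A tag may not start with '.' or '-' and is limited to 128 characters.
    tag = tag.lstrip(".-")[:128]
    # Hypothetical fallback for names that sanitize to an empty string.
    return tag or "unnamed"
```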

5-5: Temporary tus branch trigger should be removed after merge.

The tus branch trigger is useful for testing during development but should be removed once the feature is merged to main to avoid publishing unintended images.

pmultiqc_service/app.py (4)

444-446: Unused variable filetype.

The variable filetype is assigned but never used. As flagged by static analysis, this should be removed or used.

♻️ Proposed fix
        # Extract metadata
        filename = metadata.get("filename", "upload.zip")
-       filetype = metadata.get("filetype", "application/zip")

478-486: Unused variable upload_dir and cleanup logic issue.

The variable upload_dir is assigned but never used. Additionally, the cleanup logic attempts to remove .info file but file_path points to the uploaded file itself (which was already moved), not the TUS upload directory structure.

♻️ Proposed fix: Remove unused variable or fix cleanup logic
        # Clean up TUS upload directory to prevent re-upload
        # The file_path is like /tmp/pmultiqc_uploads/{upload_id}
        # We need to delete the .info file and directory
        try:
-           upload_dir = os.path.dirname(file_path)
            info_file = file_path + ".info"
            if os.path.exists(info_file):
                os.remove(info_file)
                logger.info(f"Removed TUS metadata file: {info_file}")
            # Note: The upload file itself is already moved, so no need to delete it
        except Exception as e:
            logger.warning(f"Failed to clean up TUS metadata: {e}")

2851-2877: Exception handling should use raise ... from for proper chaining.

Per static analysis (B904), when re-raising exceptions within an except block, use raise ... from err or raise ... from None to preserve the exception chain.

♻️ Proposed fix
    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Error retrieving TUS job mapping: {e}")
-       raise HTTPException(status_code=500, detail="Error retrieving job information")
+       raise HTTPException(status_code=500, detail="Error retrieving job information") from e
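
The effect of `from e` can be seen in a self-contained sketch. `ServiceError` is a hypothetical stand-in for `HTTPException` so the example runs without FastAPI installed, and `get_job_id` is an illustrative name rather than the endpoint in app.py.

```python
from typing import Callable


class ServiceError(Exception):
    """Hypothetical stand-in for fastapi.HTTPException."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def get_job_id(fetch: Callable[[str], str], filename: str) -> str:
    try:
        return fetch(filename)
    except ServiceError:
        # Deliberate HTTP-style errors pass through unchanged.
        raise
    except Exception as e:
        # 'from e' records the original error as __cause__ on the new one.
        raise ServiceError(500, "Error retrieving job information") from e
```

With the chain preserved, a traceback shows the underlying failure followed by "The above exception was the direct cause of the following exception", which is usually what a person debugging the 500 response needs.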

388-431: Background processing function handles errors appropriately.

The process_upload_in_background function:

  • Creates extraction directory
  • Validates and extracts ZIP
  • Detects input type
  • Updates job status on failure
  • Spawns processing thread

However, per static analysis, using logging.exception instead of logging.error would include the stack trace automatically.

♻️ Optional improvement for better error logging
    except Exception as e:
-       logger.error(f"Background processing failed for job {job_id}: {e}")
+       logger.exception(f"Background processing failed for job {job_id}")
        update_job_progress(
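
The difference between the two logging calls is easy to check in isolation. The helper below is a sketch with a hypothetical name; it logs a caught exception either way and returns what was written, so the two outputs can be compared.

```python
import io
import logging


def run_and_capture(use_exception: bool) -> str:
    """Log a caught ZeroDivisionError and return the text that was emitted."""
    stream = io.StringIO()
    logger = logging.getLogger(f"sketch.{use_exception}")
    logger.propagate = False
    logger.handlers.clear()
    logger.addHandler(logging.StreamHandler(stream))
    logger.setLevel(logging.ERROR)
    try:
        1 / 0
    except Exception as e:
        if use_exception:
            # logger.exception logs at ERROR level and appends the traceback.
            logger.exception("Background processing failed for job %s", "demo")
        else:
            # logger.error records only the formatted message.
            logger.error("Background processing failed for job %s: %s", "demo", e)
    return stream.getvalue()
```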
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1938816 and a44a62e.

📒 Files selected for processing (5)
  • .github/workflows/build-and-push.yml
  • pmultiqc_service/app.py
  • pmultiqc_service/k8s/configmap.yaml
  • pmultiqc_service/requirements.txt
  • pmultiqc_service/templates/index.html
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-31T10:21:32.605Z
Learnt from: cbielow
Repo: bigbio/pmultiqc PR: 414
File: pmultiqc_service/app.py:1992-1992
Timestamp: 2025-10-31T10:21:32.605Z
Learning: In pmultiqc_service/app.py, when rendering templates (especially index.html), all endpoints must pass the same config structure to maintain consistency. The config dict should include both BASE_URL and PRIDE_BUTTON_VISIBLE: {"request": request, "config": {"BASE_URL": BASE_URL, "PRIDE_BUTTON_VISIBLE": PRIDE_BUTTON_VISIBLE}}. This applies to the index() endpoint and all paths in submit_pride() that render index.html.

Applied to files:

  • pmultiqc_service/k8s/configmap.yaml
🪛 OSV Scanner (2.3.1)
pmultiqc_service/requirements.txt

[HIGH] 1-1: starlette 0.37.2: Starlette has possible denial-of-service vector when parsing large files in multipart forms

(GHSA-2c2j-9gv5-cj73)


[HIGH] 1-1: starlette 0.37.2: Starlette Denial of service (DoS) via multipart/form-data

(GHSA-f96h-pmfr-66vw)

🪛 Ruff (0.14.11)
pmultiqc_service/app.py

424-424: Do not catch blind exception: Exception

(BLE001)


425-425: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


445-445: Local variable filetype is assigned to but never used

Remove assignment to unused variable filetype

(F841)


455-455: Abstract raise to an inner function

(TRY301)


455-455: Avoid specifying long messages outside the exception class

(TRY003)


479-479: Local variable upload_dir is assigned to but never used

Remove assignment to unused variable upload_dir

(F841)


485-485: Do not catch blind exception: Exception

(BLE001)


506-506: Do not catch blind exception: Exception

(BLE001)


529-529: Do not catch blind exception: Exception

(BLE001)


533-533: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


534-534: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


2860-2860: Abstract raise to an inner function

(TRY301)


2872-2872: Abstract raise to an inner function

(TRY301)


2875-2875: Do not catch blind exception: Exception

(BLE001)


2876-2876: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


2877-2877: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (11)
pmultiqc_service/k8s/configmap.yaml (1)

14-18: Configuration change looks appropriate.

Enabling PRIDE_BUTTON_VISIBLE aligns with the new FragPipe workflow support and PRIDE examples added in this PR. The configuration structure follows the established pattern per the learnings.

Minor: Line 18 has trailing whitespace after the "0" value.

pmultiqc_service/templates/index.html (3)

371-390: TUS configuration looks well-designed for resumable uploads.

The configuration includes:

  • Extensive retry delays (up to 60s) for resilience
  • 50MB chunk size appropriate for large files
  • Fingerprint-based resume capability
  • Proper metadata passing

The implementation correctly handles the TUS protocol requirements.


560-594: Job ID polling implementation is reasonable.

The retry logic (10 attempts, 1s intervals) gives adequate time for the backend to process the TUS upload completion callback before failing. The logging at first and last attempts helps with debugging without flooding the console.
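
A bounded polling loop of this shape, including the cancellation check suggested in the later nitpick, might look as follows. This is a Python sketch of the pattern rather than the JavaScript in index.html; `fetch`, `sleep`, and `is_cancelled` are injected, hypothetical callables so the loop can be exercised without a server or real waiting.

```python
import time
from typing import Callable, Optional


def poll_job_id(
    fetch: Callable[[str], Optional[str]],
    filename: str,
    max_attempts: int = 10,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    is_cancelled: Callable[[], bool] = lambda: False,
) -> Optional[str]:
    """Ask for the job_id up to max_attempts times, pausing between attempts."""
    for attempt in range(max_attempts):
        if is_cancelled():
            # A cancelled or superseded upload must not report a stale job_id.
            return None
        job_id = fetch(filename)
        if job_id is not None:
            return job_id
        if attempt < max_attempts - 1:
            sleep(interval_s)
    return None
```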


1002-1027: FragPipe example addition looks good.

The new FragPipe example (PXD066146) is consistent with the other example entries in styling and structure, and correctly describes the expected file types (psm.tsv, ion.tsv).

pmultiqc_service/app.py (6)

538-576: TUS Location header middleware is well-implemented.

The middleware correctly handles:

  • Relative paths (/files/...)
  • Absolute URLs with different hosts
  • Proper BASE_URL prepending for ingress scenarios
  • Logging for debugging

This is essential for TUS to work correctly behind a reverse proxy/ingress.
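
The rewriting rule can be sketched as a pure function, independent of any middleware framework. The function name, the `prefix` default, and the example base URLs are assumptions; the real middleware in app.py may differ in detail.

```python
from urllib.parse import urlsplit


def rewrite_location(location: str, base_url: str, prefix: str = "/files") -> str:
    """Return a Location value that routes through the public base URL."""
    parts = urlsplit(location)
    path = parts.path
    # Leave anything that is not a TUS upload URL untouched.
    if path != prefix and not path.startswith(prefix + "/"):
        return location
    # Drop the internal scheme and host (if any) and prepend the public base.
    rewritten = base_url.rstrip("/") + path
    if parts.query:
        rewritten += "?" + parts.query
    return rewritten
```

Without a rewrite of this kind, the client would send its PATCH requests to the internal path or host returned by the TUS router, which is typically unreachable from outside the ingress.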


1684-1690: FragPipe detection logic looks correct.

The detection checks for psm.tsv and ion.tsv files which are distinctive FragPipe outputs. The placement after MaxQuant and before DIANN is appropriate since these are more specific file patterns.


698-709: FragPipe file detection in PRIDE file filtering.

The addition of psm.tsv and ion.tsv detection in filter_search_files allows FragPipe datasets from PRIDE to be properly identified and downloaded. The exact filename matching (filename_lower == "psm.tsv") is appropriate since these are standard FragPipe output names.
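
A minimal sketch of exact-name matching, assuming POSIX-style paths such as those found in an extracted archive. `FRAGPIPE_FILENAMES` and `looks_like_fragpipe` are hypothetical names; the real logic lives in detect_input_type() and filter_search_files().

```python
from pathlib import PurePosixPath
from typing import Iterable

# Standard FragPipe output names mentioned in this PR.
FRAGPIPE_FILENAMES = {"psm.tsv", "ion.tsv"}


def looks_like_fragpipe(paths: Iterable[str]) -> bool:
    """Return True if any path's final component is exactly psm.tsv or ion.tsv."""
    return any(
        # Compare only the basename, case-insensitively, as in the review.
        PurePosixPath(p).name.lower() in FRAGPIPE_FILENAMES
        for p in paths
    )
```

Matching the whole basename rather than a suffix means a file such as `my_psm.tsv` is not mistaken for FragPipe output.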


579-590: TUS router mounting configuration looks appropriate.

The TUS router is mounted at /files with:

  • max_size matching MAX_FILE_SIZE (10GB default)
  • on_upload_complete callback for post-upload processing
  • 7-day auto-cleanup for incomplete uploads

The configuration aligns with the frontend TUS client setup.


382-385: CORS configuration correctly updated for TUS protocol.

TUS requires:

  • PATCH and HEAD methods (added)
  • Custom headers like Upload-Offset, Tus-Resumable (covered by *)
  • Exposed headers for client to read response headers

The wildcard headers are acceptable here given the TUS protocol requirements, though in production you may want to explicitly list only the required TUS headers for defense-in-depth.


1794-1796: FragPipe plugin integration in MultiQC command is valid.

The --fragpipe-plugin argument is a supported pmultiqc command-line flag defined in pmultiqc/cli.py and properly integrated with the FragPipeModule implementation. The argument is documented in usage examples and has comprehensive test coverage.

pmultiqc_service/requirements.txt (1)

15-15: Confirmed: tuspyserver>=4.2.0 is stable and API-compatible.

The create_tus_router API and all parameters used in app.py (prefix, files_dir, max_size, on_upload_complete, days_to_keep) are consistent with the documented tuspyserver API. Version 4.2.0 is a stable release with the latest patch 4.2.3 available (Nov 2025). The callback signature handle_upload_complete(file_path: str, metadata: dict) matches the expected interface. No compatibility issues.

@@ -1,4 +1,4 @@
-fastapi>=0.104.0,<0.105.0
+fastapi>=0.110.0,<0.111.0

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

FastAPI 0.110.0 starlette dependency version

💡 Result:

FastAPI 0.110.0 pins Starlette to ">=0.37.2,<0.38.0". [1][2]

Sources:

  • FastAPI release notes. [1]
  • FastAPI GitHub / pyproject dependency listing (discussion). [2]

🌐 Web query:

GHSA-2c2j-9gv5-cj73 starlette vulnerability

💡 Result:

Summary: GHSA-2c2j-9gv5-cj73 is a moderate-severity denial-of-service issue in Starlette (CVE-2025-54121). Parsing multipart forms with large files (over the default spool size) can block the main/event thread when rolling an upload to disk, preventing new connections. Affected versions: < 0.47.2 (notably 0.47.0 and 0.47.1). Fixed in 0.47.2 — upgrade to 0.47.2 or later. (github.com)

Citations:


🌐 Web query:

GHSA-f96h-pmfr-66vw starlette vulnerability multipart

💡 Result:

Summary — what GHSA-f96h-pmfr-66vw is

  • A high-severity DoS in Starlette (published Oct 15, 2024): multipart/form-data parts that lack a filename are treated as text fields and buffered with no per-part size limit, allowing attackers to exhaust memory. [1][2]

Affected / fixed

  • Affected: starlette versions < 0.40.0.
  • Fixed in: starlette 0.40.0 (adds max_part_size to MultiPartParser). [1][3]

Immediate action

  1. Upgrade any Starlette (or FastAPI) services to >= 0.40.0. [1][3]
  2. As defense-in-depth, enforce request and part size limits at the reverse proxy (e.g., nginx, cloud WAF) and validate multipart input server-side. [2][3]

References
[1] GitHub Advisory GHSA-f96h-pmfr-66vw (Starlette) — published Oct 15, 2024.
[2] Tenable / CVE-2024-47874 summary.
[3] Starlette commit / changelog fixing multipart (adds max_part_size).


🌐 Web query:

latest stable FastAPI version 2025

💡 Result:

The latest stable FastAPI release in 2025 was 0.128.0, published December 27, 2025. [1][2]

Sources:

  • PyPI release page for fastapi. [1]
  • Safety DB listing (notes latest v0.128.0). [2]

🌐 Web query:

FastAPI 0.128.0 starlette dependency version

💡 Result:

FastAPI 0.128.0 requires Starlette in the 0.xx series and (per packaging updates) allows Starlette 0.51 — i.e., compatible with Starlette <0.52. [1][2]

Sources:

  • PyPI release metadata for fastapi 0.128.0. [1]
  • Fedora package changelog (notes: "Allow Starlette 0.51"). [2]

Security: Starlette 0.37.2 has a HIGH-severity memory exhaustion vulnerability (GHSA-f96h-pmfr-66vw).

FastAPI 0.110.x restricts Starlette to the 0.37.x series (>=0.37.2,<0.38.0), which is vulnerable to CVE-2024-47874 (GHSA-f96h-pmfr-66vw): multipart form fields without filenames are buffered with no per-part size limit, allowing attackers to exhaust memory. Since this service handles file uploads, this is a critical concern.

Upgrade FastAPI to a much newer version that allows Starlette >= 0.40.0 (e.g., FastAPI 0.128.0+), which includes the patch.
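
A quick way to check an installed version against the fix is a numeric comparison. The sketch below assumes plain `X.Y.Z` version strings; `parse_version` and `is_patched` are hypothetical helpers, and the third-party `packaging` library would be a more robust choice when pre-release or post-release tags are involved.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a plain 'X.Y.Z' version string into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def is_patched(starlette_version: str, fixed_in: str = "0.40.0") -> bool:
    """Return True if the version is at or above the release that fixed the issue."""
    # Tuples compare element by element, so (0, 47, 2) >= (0, 40, 0).
    return parse_version(starlette_version) >= parse_version(fixed_in)
```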

🧰 Tools
🪛 OSV Scanner (2.3.1)

[HIGH] 1-1: starlette 0.37.2: Starlette has possible denial-of-service vector when parsing large files in multipart forms

(GHSA-2c2j-9gv5-cj73)


[HIGH] 1-1: starlette 0.37.2: Starlette Denial of service (DoS) via multipart/form-data

(GHSA-f96h-pmfr-66vw)

🤖 Prompt for AI Agents
In `@pmultiqc_service/requirements.txt` at line 1, Update the FastAPI dependency
line in requirements.txt from "fastapi>=0.110.0,<0.111.0" to a version that
allows Starlette >=0.40.0 (for example "fastapi>=0.128.0,<0.129.0" or a newer
patch range); then reinstall or regenerate your lockfile (pip-compile, pipenv
lock, or poetry lock as applicable) and run the test suite to ensure
compatibility (check for any import/API changes after upgrading FastAPI in your
codebase).

Remove tus branch build action in github actions
@codecov bot commented Jan 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 8.63%. Comparing base (5f4da46) to head (a5ab34a).
⚠️ Report is 19 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff            @@
##            main    #540      +/-   ##
========================================
+ Coverage   6.83%   8.63%   +1.80%     
========================================
  Files         44      44              
  Lines       5680    5793     +113     
========================================
+ Hits         388     500     +112     
- Misses      5292    5293       +1     

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.github/workflows/build-and-push.yml:
- Around line 36-44: The "Determine Docker tag" step contains an unreachable
else branch and an unused step id; either remove the else block and delete "id:
tag" (since you only trigger on main/master and you write DOCKER_TAG to
GITHUB_ENV), or if you intend to support other branches, expand the workflow
trigger to include branch patterns and replace the branch-to-tag logic with a
sanitized tag (e.g., set DOCKER_TAG by taking env.BRANCH_NAME and replacing '/'
with '-' before writing to GITHUB_ENV) and keep or use an id only if you need
step outputs.
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a44a62e and 86da4b9.

📒 Files selected for processing (1)
  • .github/workflows/build-and-push.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: test_maxquant_dia
  • GitHub Check: test_dia
  • GitHub Check: test_diann
  • GitHub Check: test_tmt
  • GitHub Check: test_lfq
  • GitHub Check: test_maxquant
  • GitHub Check: test_proteobench
  • GitHub Check: test_mzid_mzML
  • GitHub Check: test_fragpipe
  • GitHub Check: test_mzid_mgf
  • GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (2)
.github/workflows/build-and-push.yml (2)

32-34: LGTM!

Standard and correct approach for extracting the branch name from GITHUB_REF.


46-59: LGTM!

Good additions:

  • Dynamic tagging via environment variable is correctly wired.
  • OCI labels (org.opencontainers.image.*) follow best practices and provide useful traceability metadata (source repo, commit SHA, version).

Comment on lines +36 to +44
      - name: Determine Docker tag
        id: tag
        shell: bash
        run: |
          if [[ "${{ env.BRANCH_NAME }}" == "main" ]] || [[ "${{ env.BRANCH_NAME }}" == "master" ]]; then
            echo "DOCKER_TAG=latest" >> $GITHUB_ENV
          else
            echo "DOCKER_TAG=${{ env.BRANCH_NAME }}" >> $GITHUB_ENV
          fi

⚠️ Potential issue | 🟡 Minor

Unreachable else branch and unused step ID.

Two observations:

  1. The workflow only triggers on main and master branches (line 5), so the else branch (lines 42-44) will never execute. If this is intentional, consider removing the dead code; otherwise, expand the trigger to include other branches.

  2. The id: tag is set but never referenced. Since outputs go to GITHUB_ENV, the id can be removed.

🔧 Suggested simplification (if only main/master is intended)
       - name: Determine Docker tag
-        id: tag
         shell: bash
         run: |
-          if [[ "${{ env.BRANCH_NAME }}" == "main" ]] || [[ "${{ env.BRANCH_NAME }}" == "master" ]]; then
-            echo "DOCKER_TAG=latest" >> $GITHUB_ENV
-          else
-            echo "DOCKER_TAG=${{ env.BRANCH_NAME }}" >> $GITHUB_ENV
-          fi
+          echo "DOCKER_TAG=latest" >> $GITHUB_ENV

Alternatively, if you intend to support feature branches in the future, consider sanitizing branch names (replacing / with -) since Docker tags cannot contain /.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@pmultiqc_service/templates/index.html`:
- Around line 547-560: The onlineHandler uses a non-existent upload.isUploading
property so the condition can be true incorrectly and trigger duplicate resume
calls; replace this check by tracking upload status yourself (e.g., introduce a
boolean flag like isPausedByUser or isUploadRunning) and update that flag when
you call upload.start() or upload.abort(), then change the onlineHandler
condition (which references currentUploadXhr and upload) to consult your flag
instead of upload.isUploading so resume is only attempted when the upload is
actually paused by the user.
- Around line 9-10: Update the embedded tus-js-client from v4.2.0 to v4.3.1 by
changing the script tag that references the library (the <script
src="...tus-js-client@4.2.0..."> element) to point to the 4.3.1 CDN URL and
replace the current integrity attribute value with the correct SRI hash for
v4.3.1; ensure the crossorigin attribute remains present if needed and test
uploads to confirm IETF draft and Node Buffer behavior improvements work as
expected.
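
The first item above (tracking upload state yourself instead of reading `upload.isUploading`) could be sketched roughly as follows. This is only an illustration: `createUploadController`, `markStopped`, and `onOnline` are hypothetical names, and the only thing assumed about the upload object is that it exposes `start()` and `abort()`.

```javascript
// Hypothetical helper: keeps an explicit running flag so the 'online'
// handler never has to guess the state from the upload object.
function createUploadController(upload) {
  let isUploadRunning = false;

  // Starts (or resumes) the upload unless one is already in flight.
  function start() {
    if (isUploadRunning) return false; // avoid duplicate resume calls
    isUploadRunning = true;
    upload.start();
    return true;
  }

  // User-initiated pause: stop the transfer and clear the flag.
  function pause() {
    isUploadRunning = false;
    upload.abort();
  }

  // Call this from the error/success callbacks when the transfer stops.
  function markStopped() {
    isUploadRunning = false;
  }

  function isRunning() {
    return isUploadRunning;
  }

  // The 'online' handler simply reuses start(), which is a no-op
  // while a transfer is already running.
  return { start, pause, markStopped, isRunning, onOnline: start };
}
```

In a browser this would be wired up with something like `window.addEventListener('online', controller.onOnline)`; because the functions close over the flag rather than using `this`, they can be passed around as plain callbacks.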
🧹 Nitpick comments (1)
pmultiqc_service/templates/index.html (1)

1038-1038: Minor: Redundant CSS shorthand value.

Static analysis flagged "0 0 5px 0" as having a redundant fourth value. When the left margin equals the right margin (both 0), the shorthand can be simplified to "0 0 5px".

Note: This same pattern appears in other example blocks as well (lines 930, 957, 984, 1011), so you may want to address them all together for consistency.

♻️ Suggested fix
-                            <h4 style="margin: 0 0 5px 0; color: #333;">PXD066146 - FragPipe Analysis</h4>
+                            <h4 style="margin: 0 0 5px; color: #333;">PXD066146 - FragPipe Analysis</h4>
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2a446e3 and a5ab34a.

📒 Files selected for processing (1)
  • pmultiqc_service/templates/index.html
🧰 Additional context used
🪛 GitHub Check: CodeFactor
pmultiqc_service/templates/index.html

[warning] 1038-1038: pmultiqc_service/templates/index.html#L1038
Expected "0 0 5px 0" to be "0 0 5px". (shorthand-property-no-redundant-values)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: test_mzid_mgf
  • GitHub Check: test_lfq
  • GitHub Check: test_proteobench
  • GitHub Check: test_mzid_mzML
  • GitHub Check: test_tmt
  • GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (6)
pmultiqc_service/templates/index.html (6)

111-114: LGTM!

The Resume and Cancel buttons are properly structured with appropriate initial hidden state and onclick handlers.


267-299: LGTM!

The resumeUpload and cancelUpload functions correctly handle the TUS upload lifecycle. Good to see cleanupUploadListeners() being called in cancelUpload to prevent memory leaks.


393-434: Well-designed error classification.

The error handler properly distinguishes between network errors (resumable) and permanent failures. Network errors preserve the upload instance for manual resume, while permanent failures clean up resources appropriately.
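
For readers who have not opened the diff, a resumable-versus-permanent split like the one described here might look like the sketch below. The status-code rules are assumptions chosen for illustration, not the logic in this PR, and `classifyUploadError` is a hypothetical name.

```javascript
// Sketch only: these rules are assumptions, not the PR's actual handler.
// `status` is the HTTP status of the failed request, or null/undefined/0
// when no response was received at all.
function classifyUploadError(status) {
  // No response usually means the connection dropped mid-transfer.
  if (status === null || status === undefined || status === 0) {
    return 'resumable';
  }
  // Timeouts, throttling, and server-side failures tend to be transient.
  if (status === 408 || status === 429 || status >= 500) {
    return 'resumable';
  }
  // Anything else (for example 404 or 413) will not succeed on retry.
  return 'permanent';
}
```

A caller would keep the upload instance around for a later manual resume when the result is `'resumable'`, and release it otherwise. The status itself would come from whatever response object the upload library attaches to the error, if any.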


566-601: LGTM!

The polling implementation is well-designed with proper URL encoding, bounded retry attempts, and reduced logging noise. The error handling gracefully retries on "not ready" responses.
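
As a rough illustration of the pattern praised here (URL-encoded lookup plus a bounded number of attempts), consider the sketch below. The endpoint path, the `ready` field, and the function names are hypothetical; the real route and response shape are not shown in this review.

```javascript
// Hypothetical endpoint path, used for illustration only.
function buildStatusUrl(filename) {
  // encodeURIComponent keeps spaces, '&', and '/' from breaking the query.
  return '/job-by-filename?name=' + encodeURIComponent(filename);
}

// Polls until the job is ready or the attempt budget is exhausted.
// `fetchJson` is injected (assumed to resolve to a parsed JSON object)
// so the loop can be exercised without a network.
async function pollForJob(filename, fetchJson, maxAttempts = 10, delayMs = 1000) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await fetchJson(buildStatusUrl(filename));
    if (result && result.ready) {
      return result;
    }
    // "Not ready yet": wait before the next attempt, but not after the last.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error('Job not ready after ' + maxAttempts + ' attempts');
}
```

Bounding the loop means a job that never appears surfaces as a single error rather than an endless stream of requests.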


806-824: LGTM!

Both utility functions are well-implemented. escapeHtml provides comprehensive XSS protection by escaping all necessary HTML entities, and cleanupUploadListeners properly removes event listeners to prevent memory leaks. These address concerns from previous reviews.
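
For context, an escaping helper of this kind is usually only a few lines. The version below is a minimal sketch and not necessarily the implementation in `index.html`; only the name `escapeHtml` comes from the review.

```javascript
// Minimal sketch: replaces the five characters that can change how
// a string is parsed when it is assigned to innerHTML.
function escapeHtml(text) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  };
  // String() guards against non-string input such as numbers or null.
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}
```

Escaping `&` in the same single pass as the other characters avoids double-escaping the entities produced for them.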


826-856: XSS vulnerability addressed.

The escapeHtml(message) call properly sanitizes user-controlled content before inserting into innerHTML. The conditional logic for showing/hiding the Resume button based on error type is well-implemented.

@ypriverol ypriverol merged commit e6053ac into main Jan 16, 2026
23 checks passed