Add file type validation #13802

Open
spider-yamet wants to merge 3 commits into infiniflow:main from spider-yamet:fix/validation-file-type

spider-yamet commented Mar 26, 2026

What problem does this PR solve?

This PR fixes WebDAV sync behavior for unsupported file types (#13795).

Previously, the WebDAV connector selected files primarily by modified time (and size threshold) and could still pass unsupported extensions into the download/document-generation path. This caused unnecessary processing and inconsistent behavior compared with connectors that validate file type earlier.

This change adds extension validation in two places:

  1. Early filter during recursive listing to skip unsupported files before they enter the download flow.
  2. Defensive filter before download/document creation to prevent unsupported files from being processed if any listing edge case slips through.

It also wires allow_images into the WebDAV sync path so image extension handling follows connector policy.
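
The two checks and the allow_images policy described above could be sketched roughly as follows. This is an illustration only, not the code in this PR: the extension sets, `is_supported_file`, `list_candidates`, and `download_and_build` are hypothetical names, and the real connector's supported-extension list is not shown here.

```python
from pathlib import PurePosixPath
from typing import Iterable, Iterator, Optional

# Hypothetical extension sets for illustration; the connector's real lists may differ.
DOCUMENT_EXTENSIONS = frozenset({".pdf", ".txt", ".md"})
IMAGE_EXTENSIONS = frozenset({".png", ".jpg", ".jpeg"})


def is_supported_file(path: str, allow_images: bool = False) -> bool:
    """Return True if the file extension is accepted under the current policy."""
    ext = PurePosixPath(path).suffix.lower()
    if ext in DOCUMENT_EXTENSIONS:
        return True
    # Image extensions are accepted only when the connector allows images.
    return allow_images and ext in IMAGE_EXTENSIONS


def list_candidates(paths: Iterable[str], allow_images: bool = False) -> Iterator[str]:
    """Check 1: early filter applied while walking the remote listing."""
    for path in paths:
        if is_supported_file(path, allow_images):
            yield path


def download_and_build(path: str, allow_images: bool = False) -> Optional[dict]:
    """Check 2: defensive filter right before download/document creation."""
    if not is_supported_file(path, allow_images):
        return None  # skipped; nothing is downloaded
    # Stand-in for the real download and document construction.
    return {"id": path, "extension": PurePosixPath(path).suffix.lower()}
```

Both checks share one predicate so that the listing filter and the pre-download guard cannot drift apart; the second check only matters if a path reaches the download step without passing through the listing filter.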

Scope is intentionally limited to WebDAV for a focused bug-fix PR.

Type of change

  • Bug Fix (non-breaking change which fixes an issue)

How was this tested?

  • Manual verification with mixed file types under the configured WebDAV path:
    • supported: .pdf, .txt, .md
    • unsupported: .exe, .bin, .dat
  • Triggered full sync and polling sync.
  • Confirmed unsupported files are skipped before download.
  • Confirmed supported files are still indexed normally.
  • Confirmed image handling follows allow_images setting.

Fixes: #13795

dosubot bot added the size:S and 🐞 bug labels Mar 26, 2026
@spider-yamet (Contributor, Author)

@yingfeng Would love to hear your opinion on this PR. Thanks

Magicbook1108 added the ci label Mar 26, 2026
codecov bot commented Mar 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.72%. Comparing base (6a4a9de) to head (8e22a68).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #13802   +/-   ##
=======================================
  Coverage   96.72%   96.72%           
=======================================
  Files          10       10           
  Lines         702      702           
  Branches      112      112           
=======================================
  Hits          679      679           
  Misses          5        5           
  Partials       18       18           

@spider-yamet (Contributor, Author)

Would appreciate your feedback @Magicbook1108 @yingfeng :)

@spider-yamet (Contributor, Author)

Could you please give me your opinion, @Magicbook1108?

Labels

  • 🐞 bug: Something isn't working, pull request that fix bug.
  • ci: Continue Integration
  • size:S: This PR changes 10-29 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

[Bug]: WebDAV sync does not filter unsupported files before processing

2 participants