-
Notifications
You must be signed in to change notification settings - Fork 1.8k
endpoint import optimize #13521
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
endpoint import optimize #13521
Conversation
|
wdyt @kiblik ("endpoint guru") |
🟡 Please give this pull request extra attention during review.This pull request may introduce a potential mass-assignment vulnerability: Endpoint.objects.create(**kwargs) is used with kwargs coming from the function arguments, and if those kwargs originate from unvalidated user input (e.g., raw request data) an attacker could set multiple model fields via **kwargs; if the function is only called with validated/internal values there is no issue.
🟡 Potential Mass Assignment Vulnerability in
|
| Vulnerability | Potential Mass Assignment Vulnerability |
|---|---|
| Description | The function calls Endpoint.objects.create(**kwargs) where kwargs come from the function arguments. If those kwargs include untrusted user input that maps to model fields, this is mass assignment. Whether it is actually vulnerable depends on the source of kwargs: if endpoint_get_or_create is only ever called with validated/internal values, there's no vulnerability; if it is called with raw request data (e.g. request.POST or unfiltered input), then this is a mass-assignment vulnerability because multiple model attributes can be set via the **kwargs expansion. |
django-DefectDojo/dojo/endpoint/utils.py
Lines 54 to 80 in 62501b4
| def endpoint_get_or_create(**kwargs): | |
| # This code looks a bit ugly/complicated. | |
| # But this method is called so frequently that we need to optimize it. | |
| # It executes at most one SELECT and one optional INSERT. | |
| qs = endpoint_filter(**kwargs) | |
| # Fetch up to two matches in a single round-trip. This covers | |
| # the common cases efficiently: zero (create) or one (reuse). | |
| matches = list(qs.order_by("id")[:2]) | |
| if not matches: | |
| # Most common case: nothing exists yet | |
| return Endpoint.objects.create(**kwargs), True | |
| if len(matches) == 1: | |
| # Common case: exactly one existing endpoint | |
| return matches[0], False | |
| logger.warning( | |
| f"Endpoints in your database are broken. " | |
| f"Please access {reverse('endpoint_migrate')} and migrate them to new format or remove them.", | |
| ) | |
| # Get the oldest endpoint first, and return that instead | |
| # a datetime is not captured on the endpoint model, so ID | |
| # will have to work here instead | |
| return matches[0], False | |
| def clean_hosts_run(apps, change): |
All finding details can be found in the DryRun Security Dashboard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved
|
Quickly looking at the change, looks like a good place for (shameless self promo) https://github.com/fopina/django-bulk-update-or-create |
|
Interesting. My search for a solution didn't bring me your library. I did think about writing similar logic, but decided not to because A) the endpoint model doesn't have a single primary key field B) there probably will be some big changes coming around endpoints early next year which could make the investment right now useless. But good to know this library, it might be handy elsewhere or later on after the big endpoint change. |
The way endpoints are handled during import has a large impact on performance.
The fastest way would be to handle them in batches, but that's not a trivial change.
There are some low hanging fruits that we can implement.
This PR and already reduces the number of queries significantly.
The Atomic wrapper has been removed as it doesn't do anything in this scenario with only one select and one write.