Skip to content

Conversation

albertisfu
Copy link
Owner

@albertisfu albertisfu commented Jul 4, 2025

@mlissner Here’s the PR that adds support for filtering nested fields and deferring unrequested fields. So we can agree on the overall approach before submitting the PR to the maintainer’s repository.

Basically this update enhances the DynamicFieldsMixin to support filtering nested fields in child serializers using the existing fields= and omit= query parameters.

These parameters can be used in combination with top-level fields. For example:
?omit=description,recap_documents__plain_text
?fields=description,recap_documents__plain_text

Both omit and fields can also be combined in a single request:
?fields=id,name,recap_documents__pacer_doc_id,recap_documents__plain_text&omit=entry_number,recap_documents__plain_text

Here some examples:
api/rest/v4/docket-entries/?fields=id,recap_documents__page_count
Screenshot 2025-07-04 at 5 54 36 p m

/api/rest/v4/docket-entries/?omit=entry_number,recap_documents__plain_text
Screenshot 2025-07-04 at 5 55 50 p m

Deferred fields

In order to defer filtered fields, the ViewSet’s queryset needs to be aware of which fields are omitted in the serializer. For that, a new method get_deferred_model_fields has been added to the DynamicFieldsMixin.

  • If omit is used, the omitted fields are passed directly.
  • If fields is used, it identifies all model fields with corresponding DB columns and defers those not included in the request.

A new DeferredFieldsMixin has also been introduced for use in ViewSets where you want this deferring behavior. It calls get_deferred_model_fields from the serializer (which must also use DynamicFieldsMixin) and modifies the queryset accordingly.

This mixin overrides get_queryset, applying .defer() to top-level fields and .prefetch_related() with defer to nested relations.

For instance, the following request:
api/rest/v4/docket-entries/?fields=recap_documents__plain_text

Results in a query that only fetches the plain_text column from the related RECAPDocument model:
Screenshot 2025-07-04 at 6 13 06 p m

However during my testing I did notice a potential conflict with the existing queryset in DocketEntryViewSet:

queryset = (
        DocketEntry.objects.select_related(
            "docket",  # For links back to dockets
        )
        .prefetch_related(
            "recap_documents",  # Sub items
            "recap_documents__tags",  # Sub-sub items
            "tags",  # Tags on docket entries
        )
        .order_by("-id")
    )

This prefetch logic might interfere with the prefetch_related applied by DeferredFieldsMixin. I’m currently analyzing the side effects of this mixin to determine whether it introduces issues for us or for other users of drf-dynamic-fields. I’ll follow up with my findings.

@albertisfu albertisfu marked this pull request as ready for review July 5, 2025 00:16
@mlissner
Copy link

mlissner commented Jul 8, 2025

Interesting. So if we were writing documentation, we'd say:

  1. Use the DynamicFieldsMixin in your serializer to allow the field and omit parameters to work on the root object (but not nested fields).
  2. If you want the database queries modified as well and if you want nested fields to work on your omit and fields parameters, add the DeferredFieldsMixin to your ViewSet.

I assume it's as simple as that (assuming you work out the last details of the prefetch)?

@albertisfu
Copy link
Owner Author

@mlissner

Use the DynamicFieldsMixin in your serializer to allow the field and omit parameters to work on the root object (but not nested fields).

Well, actually, if you're already using DynamicFieldsMixin, it will allow the fields and omit parameters to filter both root and nested fields.

If you want the database queries modified as well and if you want nested fields to work on your omit and fields parameters, add the DeferredFieldsMixin to your ViewSet.

In this case, DeferredFieldsMixin is only used to ensure that filtered fields are excluded from database queries.

  • I've already applied some tweaks to fix the issues I previously encountered when testing these changes in our codebase.

There were two issues:

  • In the old approach, I was using to_representation to prune fields from the nested serializer. While this correctly omitted the payload in the response, it caused problems because to_representation is called after the fields are accessed. As a result, the DeferredFieldsMixin couldn’t work properly, fields were deferred but still accessed, leading to many additional queries. To resolve this, nested fields are now dropped within the fields() method, ensuring no database access occurs for those fields.
  • The second issue was related to querysets defined in the ViewSet that use complex attributes like select_related or prefetch_related.
queryset = (
        DocketEntry.objects.select_related(
            "docket",  # For links back to dockets
        )
        .prefetch_related(
            "recap_documents",  # Sub items
            "recap_documents__tags",  # Sub-sub items
            "tags",  # Tags on docket entries
        )
        .order_by("-id")
    )

The problem is that you can't defer a field and use select_related on it at the same time. So, in this new approach, any select_related field is removed from the list of deferred fields.

Also, if the original queryset includes only() or defer() calls for top-level fields, then no additional deferring is applied to filtered fields. This assumes that the selected fields are already tailored for the API endpoint to function correctly. Otherwise, you could end up combining only() and defer() in conflicting ways.

Another issue is that if the original queryset contains prefetch_related lookups, they need to be reordered to avoid conflicts. Specifically, the Prefetch objects added by DeferredFieldsMixin (to defer nested fields) should be applied first, followed by the original prefetch_related objects.

Additionally, you should not defer the foreign key field that links the nested object to the root instance, as this field is required for rendering the nested relationship. If you defer it, additional queries will be triggered. For example, in the DocketEntry serializer, you shouldn't defer recap_documents__docket_entry_id. If the user includes this field in the omit list, it is removed from the payload but ignored during the deferring process.

Lastly, if the original queryset uses values() or annotations, the deferring logic is completely skipped, values() because the query explicitly defines the return fields, and annotations because they can conflict with deferred fields, requiring additional DB queries.

This approach should be working well for our use case.

However, the logic inside DeferredFieldsMixin is somewhat complex and based on design decisions that make sense for us but may not be applicable to other projects or users. In fact, this mixin might require future adjustments if we change the queryset structure or expect different behavior. Because of this, I’m unsure whether it’s a good idea to include DeferredFieldsMixin in the drf-dynamic-fields PR, or if it would be better to keep it within CourtListener.

DeferredFieldsMixin depends on a new method introduced in DynamicFieldsMixin to retrieve the fields that should be deferred. So, an alternative would be to submit only the changes to DynamicFieldsMixin, including the new get_deferred_model_fields method, and document that this method returns the filtered fields suitable for deferring. This way, users can implement their own DeferredFieldsMixin tailored to their needs.

Let me know what do you think.

@mlissner
Copy link

mlissner commented Jul 9, 2025

This is great. I agree that it sounds best to push our changes to the DynamicFieldsMixin upsteam if we can. That gets us:

  1. The nested fields can be excluded from the result.

  2. The get_deferred_model_fields method is available.

That seems best, since the maintainer didn't want a lot of additional complexity.

A few comments on implementation....

So, in this new approach, any select_related field is removed from the list of deferred fields.

Meaning that if your queryset has a select_related with field foo, and that field is omited, then the omit doesn't do anything — the select_related wins? That'll be hard to document and to explain to users. I assume there's not much we can do about this though, right?

Also, if the original queryset includes only() or defer() calls for top-level fields, then no additional deferring is applied to filtered fields.

Hm. Could the new defer and only statements be additive:

  • If the original queryset contains a defer statement, you can add additional defer statements.
  • If the original queryset contains an only statement, you can add additional only statements
  • Using a fields or only parameter can never add a field that's excluded from the original queryset by an only or a defer at the root level.

Or is this horribly complex?

Lastly, if the original queryset uses values() or annotations, the deferring logic is completely skipped, values() because the query explicitly defines the return fields...

Hm, this seems fine since we don't do this, but I could see the logic above applying here too.

Perhaps you made an API available using values for performance, but you still want to allow clients to defer fields? If this is easy to do, it seems better to me to make it consistent.


Altogether this looks nice. It'll fix our problem via the viewset mixin, and it'll fix an old bug in the dynamic fields mixin. Great stuff!

@albertisfu
Copy link
Owner Author

Thanks @mlissner for your feedback!

I think the next step I'll take is to split this PR and submit one to the maintainer's repo with only the changes in DynamicFieldsMixin, which includes the get_deferred_model_fields method.

Then, I'll open a PR in CourtListener that includes the DeferredFieldsMixin and applies your suggestions on how to handle field conflicts.

Does that sound good?

@mlissner
Copy link

Sorry, Alberto, I'm just catching up. It looks like you went ahead and did this, thank you. What do you think of the maintainer's feedback?

@albertisfu
Copy link
Owner Author

albertisfu commented Jul 10, 2025

@mlissner Yeah, I see that they prefer to keep the original mixin as untouched as possible.

The maintainer has opened a PR that modifies the approach by introducing a new mixin, NestedDynamicFieldsMixin, which lets nested serializers filter themselves instead of relying on the parent serializer:

dbrgn#43

So users can keep using DynamicFieldsMixin for serializers without nested objects, and switch to NestedDynamicFieldsMixin for cases like ours. The logic for determining which fields to defer was removed, but we can still pull that information from the request and manage the whole deferred logic ourselves.

I tested this approach in our codebase and it worked with only minor changes. I’ve offered to help the maintainer fix those issues, refine the logic, and extend the tests based on their PR. If they agree, we’ll be much closer to a solution that we can use and they can merge.

@mlissner
Copy link

Fantastic, Alberto, that sounds perfect.

@albertisfu
Copy link
Owner Author

Just an update. The new PR is ready here:
dbrgn#44

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants