Fix N+1 query pattern in task instance states and count endpoints #60352
steveahnahn wants to merge 3 commits into apache:main
This looks good at first glance. But as the filter based on map index is a new addition (and likely not to be covered by existing tests), I think it would be worth adding a small test to lock in the expected behavior and guard against future regressions. In particular, it would be a good idea to cover:
For (1), you could test both code paths by passing and omitting |
|
Thanks for the review, the scenarios you mentioned are already covered in airflow-core/tests/unit/api_fastapi/execution_api/versions/head/test_task_instances.py:
The existing parametrized cases cover mapped tasks with multiple map indices, default behavior without map_index, and filtering with task_group_id. These tests will now exercise the new db-level filtering path. I did end up adding one missed test combination: filtering by map_index without providing task_group_id.
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
|
@SameerMesiah97 it has been some time but tagging for an approval, thanks!
Problem
The previous implementation fetched all task instances from the database and then filtered by map_index in Python. For DAGs with mapped tasks that have a large number of map indices, this caused unnecessary database load and memory usage.
Solution
Push the map_index filter into the SQL query so the database handles the filtering:
- Move map_index filtering from Python to SQL in the get_task_instance_states and get_task_instance_count endpoints
- Add a map_index parameter to the _get_group_tasks helper function to filter at the database level
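To illustrate the general idea, here is a minimal sketch using the standard-library sqlite3 module rather than Airflow's actual models or SQLAlchemy queries. The task_instance table layout and both function names are hypothetical and exist only for this example; the real endpoint code may differ.

```python
import sqlite3

# Hypothetical, simplified stand-in for the task instance table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance (task_id TEXT, map_index INTEGER, state TEXT)"
)
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?)",
    [("process", i, "success" if i % 2 == 0 else "failed") for i in range(1000)],
)


def states_filtered_in_python(conn, task_id, map_index=None):
    """Old shape: fetch every row for the task, then filter in Python."""
    rows = conn.execute(
        "SELECT map_index, state FROM task_instance WHERE task_id = ?",
        (task_id,),
    ).fetchall()
    if map_index is not None:
        # All 1000 rows were already transferred before this filter runs.
        rows = [row for row in rows if row[0] == map_index]
    return rows


def states_filtered_in_sql(conn, task_id, map_index=None):
    """New shape: add the map_index predicate to the query itself."""
    query = "SELECT map_index, state FROM task_instance WHERE task_id = ?"
    params = [task_id]
    if map_index is not None:
        # Only matching rows leave the database.
        query += " AND map_index = ?"
        params.append(map_index)
    return conn.execute(query, params).fetchall()


python_result = states_filtered_in_python(conn, "process", map_index=7)
sql_result = states_filtered_in_sql(conn, "process", map_index=7)
```

Both functions return the same rows for a given map_index. The difference is that the second one never loads the non-matching rows into memory, and omitting map_index keeps the unfiltered default behavior.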