Skip to content

Performance: Critical N+1 query bottleneck in trips-for-location endpoint #316

@ARCoder181105

Description

@ARCoder181105

Description

The trips-for-location endpoint currently exhibits a severe N+1 query pattern. For every vehicle visible in the requested bounding box, the server executes multiple synchronous database queries inside a loop.

This results in linear O(N) performance degradation. In high-density transit areas (e.g., a downtown view with 50+ vehicles), a single API request triggers hundreds of database queries, causing significant latency and database contention.

Technical Detail

  • File: internal/restapi/trips_for_location_handler.go
  • Function: buildTripsForLocationEntries

The function iterates over the activeTrips map. Inside this loop, it calls buildScheduleForTrip, which triggers the following database queries for each vehicle:

  1. GetShapePointsByTripID (Fetch shape geometry)
  2. GetTrip (Fetch trip details)
  3. GetNextAndPreviousTripIDs (Fetch block information)
  4. GetStopsByIDs (Fetch stop coordinates for distance calculation)

Impact

  • Scalability: The endpoint effectively DDOS-es the database as the number of vehicles increases.
  • Latency: API response times degrade linearly.
  • 1 Vehicle ≈ 5 DB queries
  • 50 Vehicles ≈ 250+ DB queries (Single Request)

Steps to Reproduce

  1. Start Maglev with a GTFS feed containing active real-time vehicles.
  2. Make a request to /api/where/trips-for-location.json with a bounding box large enough to capture 20+ vehicles.
  3. Set includeSchedule=true.
  4. Observe the database logs or request latency; the query count will match N * 4 (approx).

Proposed Solution

Refactor the handler to use a Bulk Retrieval pattern:

  1. Phase 1 (Collection): Iterate over visible vehicles to collect all TripIDs and BlockIDs into slices.
  2. Phase 2 (Bulk Fetch): Execute single batch queries to fetch all Trips, Shapes, StopTimes, and Block-Trips at once.
  3. Phase 3 (Assembly): Convert the bulk data into in-memory maps and assemble the response without further DB calls.

I have a working implementation of this fix that reduces the query count to a constant ~5 queries regardless of vehicle count.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions