Description
User story / feature request
Create a data product that will visualize / map the stop-level metrics derived from GTFS RT trip updates.
- Materialize a week's worth of data for stop-time-related tables: materialize 1 week's worth of stop time update metrics #4104
- Create all tables for stop grain metrics and join into 1 stop grain df
  - scheduled: update `fct_daily_scheduled_stops`
  - trip_updates (stop_time_updates): `fct_trip_updates_stop_metrics` (GTFS RT sample daily stop time metrics with daily stop and trip aggregations #4172)
  - vehicle_positions: `fct_vp_stop_summaries`
    - additional intermediate table needed at trip-stop grain: `int_gtfs_rt_vs_sched__stops_served_vehicle_positions`
- Aggregate daily-trip-stop grain to week-stop grain
  - aggregated for the 1st pass portfolio; if it looks good, add a dbt model for this
- Figure out table configuration settings as part of #4172
- Exploratory work to make sure metrics make sense
- 1st pass portfolio: daily stop metrics for 2 week sample + weekday/Sat/Sun aggregation by stop: Plot RT stop metrics data-analyses#1680
- refactor after 1st pass portfolio: Add more stop time metrics #4308
- 2nd pass portfolio: 2nd draft stop metrics report data-analyses#1732
- Expand the data product and look at Jan-Jul 2025
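The week-stop-grain rollup described in the task list could be sketched as a dbt model roughly like the following. This is a hypothetical sketch: the metric column names (`pct_stop_times_with_update`, `n_stop_time_updates`) are assumptions for illustration, not columns confirmed by this issue.

```sql
-- Hypothetical sketch: roll daily trip-stop metrics up to week-stop grain.
-- Metric column names are assumptions for illustration only.
select
    stop_id,
    date_trunc(service_date, week) as service_week,
    avg(pct_stop_times_with_update) as avg_pct_stop_times_with_update,
    sum(n_stop_time_updates) as n_stop_time_updates
from {{ ref('fct_trip_updates_stop_metrics') }}
group by 1, 2
```

If this rollup looks good in the 1st pass portfolio, it could be promoted to a proper dbt model as noted above.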
Notes
Use the week's worth of data to figure out certain configurations for these mart tables
- incrementalize some views
  - Currently, these tables are all views. They're not immediately usable in a data product because `dt`/`hour` or `base64_url` must be used for partition elimination upstream. This error will come up if we keep everything as views:
    `Cannot query over table 'cal-itp-data-infra-staging.external_gtfs_rt_v2.trip_updates' without a filter over column(s) 'base64_url', 'dt', 'hour' that can be used for partition elimination`
  - The pattern to copy for how incremental models work is in #4172
    - Specifically, configs with `insert_overwrite`, sometimes needing the grain defined in order to use `unique_key='key'`
  - Make sure joins are making use of `partition` or `cluster` columns in BQ
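A minimal sketch of the incremental pattern being referenced, assuming a `dt` date partition (the specific config values and the upstream model name `stg_trip_updates` are illustrative assumptions, not the exact #4172 settings):

```sql
-- Hypothetical dbt incremental config sketch; partition field and ref name are assumptions.
-- Models using the merge strategy would define unique_key at the table's grain instead.
{{ config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    partition_by={'field': 'dt', 'data_type': 'date'}
) }}

select *
from {{ ref('stg_trip_updates') }}
{% if is_incremental() %}
-- filter on the partition column so BigQuery can apply partition elimination
where dt >= date_sub(current_date(), interval 7 day)
{% endif %}
```

Filtering on the partition column in both the model and downstream joins is what avoids the `without a filter over column(s) ... partition elimination` error quoted above.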
- Take a look at how metrics look when they're averaged to stop grain.
- Are there outliers?
- Do certain operators look funky?
- Gather learnings to inform filters needed within data product
- Go through BQ job history and look specifically at the dbt operation taking place (merge, append, etc.)
- Need to add docs related to the `staging` profile within `profiles.yml`
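For reference, a hedged sketch of what a `staging` target in `profiles.yml` might look like. The profile name, dataset, and auth method are placeholders; only the `cal-itp-data-infra-staging` project name comes from the error message quoted in this issue.

```yaml
# Hypothetical sketch of a staging profile; all names besides the
# staging project are placeholders to be replaced with real values.
calitp_warehouse:
  target: staging
  outputs:
    staging:
      type: bigquery
      method: oauth
      project: cal-itp-data-infra-staging
      dataset: staging_yourname
      threads: 4
```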