@tiffanychu90 tiffanychu90 commented Jul 24, 2025

Description

Create an intermediate table that compares scheduled stop positions with the vehicle locations path (linestring):

  • new intermediate table: int_gtfs_rt_vs_sched__stops_served_vehicle_positions
  • expand int_gtfs_schedule__stop_times_grouped (trip grain) to capture additional columns from dim_stops (stop_id, pt_geom) for use in the comparison
  • add new mart_gtfs.fct_vehicle_positions_stop_metrics, which will summarize the GTFS RT vp-derived stop grain metrics
    • first metric: whether a stop is served by vehicle positions, i.e. how many vehicle position (vp) trips come within 10 or 25 meters of the stop
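
To make the first metric concrete, here is a minimal Python sketch of the idea: measure how close each trip's vehicle positions path comes to a stop, then count the trips that fall within a threshold. It is an illustration only and not the SQL in this PR. The function names, the sample coordinates, and the flat-plane approximation around the stop are all assumptions made for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an approximation


def _to_local_xy(lon, lat, lon0, lat0):
    """Project lon/lat to meters on a flat plane centered at (lon0, lat0)."""
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_M
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y


def distance_to_path_m(stop, path):
    """Minimum distance in meters from a stop (lon, lat) to a path of (lon, lat) points."""
    lon0, lat0 = stop
    # After projection the stop sits at the origin (0, 0).
    pts = [_to_local_xy(lon, lat, lon0, lat0) for lon, lat in path]
    if len(pts) == 1:
        return math.hypot(*pts[0])
    best = math.inf  # stays infinite for an empty path
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len_sq = dx * dx + dy * dy
        if seg_len_sq == 0:
            t = 0.0  # degenerate segment: both endpoints coincide
        else:
            # Clamp the projection of the origin onto the segment to [0, 1].
            t = max(0.0, min(1.0, -(x1 * dx + y1 * dy) / seg_len_sq))
        best = min(best, math.hypot(x1 + t * dx, y1 + t * dy))
    return best


def count_trips_near_stop(stop, trip_paths, threshold_m):
    """Count trips whose path comes within threshold_m meters of the stop."""
    return sum(
        1 for path in trip_paths.values()
        if distance_to_path_m(stop, path) <= threshold_m
    )


# Hypothetical sample data: one stop and three trip paths as (lon, lat) pairs.
stop = (-118.0, 34.0)
trip_paths = {
    "trip_a": [(-118.001, 34.0001), (-117.999, 34.0001)],    # about 11 m north of the stop
    "trip_b": [(-118.001, 34.00005), (-117.999, 34.00005)],  # about 5.6 m north of the stop
    "trip_c": [(-118.001, 34.001), (-117.999, 34.001)],      # about 111 m north of the stop
}

near_10m = count_trips_near_stop(stop, trip_paths, 10)
near_25m = count_trips_near_stop(stop, trip_paths, 25)
```

The two thresholds give different counts for the same stop, which is why the description reports both: a tighter buffer is stricter about stop placement, while a looser one tolerates GPS noise in the vehicle positions.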

Resolves #4034 and works on the vehicle position derived portion of stop grain metrics for #4101

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

How has this been tested?

Post-merge follow-ups

  • No action required
  • Actions required (specified below)

@tiffanychu90 tiffanychu90 changed the title Update intermediate Expand intermediate stop times Create vehicle position derived stop grain metrics Jul 24, 2025

github-actions bot commented Jul 24, 2025

Terraform plan in iac/cal-itp-data-infra-staging/airflow/us

Plan: 4 to add, 5 to change, 0 to destroy.
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
+   create
!~  update in-place

Terraform will perform the following actions:

  # google_storage_bucket_object.calitp-staging-composer-catalog will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-staging-composer-catalog" {
!~      content             = (sensitive value)
!~      crc32c              = "ywRuIQ==" -> (known after apply)
!~      detect_md5hash      = "5cIZgRn7xQWZrtF6cTUFIQ==" -> "different hash"
!~      generation          = 1755275090821996 -> (known after apply)
        id                  = "calitp-staging-composer-data/warehouse/target/catalog.json"
!~      md5hash             = "5cIZgRn7xQWZrtF6cTUFIQ==" -> (known after apply)
        name                = "data/warehouse/target/catalog.json"
#        (16 unchanged attributes hidden)
    }

  # google_storage_bucket_object.calitp-staging-composer-dags["models/intermediate/gtfs/_int_gtfs.yaml"] will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-staging-composer-dags" {
!~      crc32c              = "1EOB9A==" -> (known after apply)
!~      detect_md5hash      = "qSASDPn6cn8Jdtmaoaz0FQ==" -> "different hash"
!~      generation          = 1754523275842512 -> (known after apply)
        id                  = "calitp-staging-composer-data/warehouse/models/intermediate/gtfs/_int_gtfs.yaml"
!~      md5hash             = "qSASDPn6cn8Jdtmaoaz0FQ==" -> (known after apply)
        name                = "data/warehouse/models/intermediate/gtfs/_int_gtfs.yaml"
#        (17 unchanged attributes hidden)
    }

  # google_storage_bucket_object.calitp-staging-composer-dags["models/intermediate/gtfs/int_gtfs_rt_vs_sched__stops_served_vehicle_positions.sql"] will be created
+   resource "google_storage_bucket_object" "calitp-staging-composer-dags" {
+       bucket         = "calitp-staging-composer"
+       content        = (sensitive value)
+       content_type   = (known after apply)
+       crc32c         = (known after apply)
+       detect_md5hash = "different hash"
+       generation     = (known after apply)
+       id             = (known after apply)
+       kms_key_name   = (known after apply)
+       md5hash        = (known after apply)
+       md5hexhash     = (known after apply)
+       media_link     = (known after apply)
+       name           = "data/warehouse/models/intermediate/gtfs/int_gtfs_rt_vs_sched__stops_served_vehicle_positions.sql"
+       output_name    = (known after apply)
+       self_link      = (known after apply)
+       source         = "../../../../warehouse/models/intermediate/gtfs/int_gtfs_rt_vs_sched__stops_served_vehicle_positions.sql"
+       storage_class  = (known after apply)
    }

  # google_storage_bucket_object.calitp-staging-composer-dags["models/intermediate/gtfs/int_gtfs_schedule__stop_times_grouped.sql"] will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-staging-composer-dags" {
!~      crc32c              = "gElm9g==" -> (known after apply)
!~      detect_md5hash      = "yDLEGvytnnWa5vUOCwQvEw==" -> "different hash"
!~      generation          = 1749663116270396 -> (known after apply)
        id                  = "calitp-staging-composer-data/warehouse/models/intermediate/gtfs/int_gtfs_schedule__stop_times_grouped.sql"
!~      md5hash             = "yDLEGvytnnWa5vUOCwQvEw==" -> (known after apply)
        name                = "data/warehouse/models/intermediate/gtfs/int_gtfs_schedule__stop_times_grouped.sql"
#        (17 unchanged attributes hidden)
    }

  # google_storage_bucket_object.calitp-staging-composer-dags["models/mart/gtfs/fct_vehicle_locations_path.sql"] will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-staging-composer-dags" {
!~      crc32c              = "d+JBfA==" -> (known after apply)
!~      detect_md5hash      = "dvuSMtXVHhvZNA9RlS/FIw==" -> "different hash"
!~      generation          = 1752161965968081 -> (known after apply)
        id                  = "calitp-staging-composer-data/warehouse/models/mart/gtfs/fct_vehicle_locations_path.sql"
!~      md5hash             = "dvuSMtXVHhvZNA9RlS/FIw==" -> (known after apply)
        name                = "data/warehouse/models/mart/gtfs/fct_vehicle_locations_path.sql"
#        (17 unchanged attributes hidden)
    }

  # google_storage_bucket_object.calitp-staging-composer-dags["models/mart/gtfs/fct_vehicle_positions_stop_metrics.sql"] will be created
+   resource "google_storage_bucket_object" "calitp-staging-composer-dags" {
+       bucket         = "calitp-staging-composer"
+       content        = (sensitive value)
+       content_type   = (known after apply)
+       crc32c         = (known after apply)
+       detect_md5hash = "different hash"
+       generation     = (known after apply)
+       id             = (known after apply)
+       kms_key_name   = (known after apply)
+       md5hash        = (known after apply)
+       md5hexhash     = (known after apply)
+       media_link     = (known after apply)
+       name           = "data/warehouse/models/mart/gtfs/fct_vehicle_positions_stop_metrics.sql"
+       output_name    = (known after apply)
+       self_link      = (known after apply)
+       source         = "../../../../warehouse/models/mart/gtfs/fct_vehicle_positions_stop_metrics.sql"
+       storage_class  = (known after apply)
    }

  # google_storage_bucket_object.calitp-staging-composer-dags["models/mart/gtfs/test_vehicle_locations.sql"] will be created
+   resource "google_storage_bucket_object" "calitp-staging-composer-dags" {
+       bucket         = "calitp-staging-composer"
+       content        = (sensitive value)
+       content_type   = (known after apply)
+       crc32c         = (known after apply)
+       detect_md5hash = "different hash"
+       generation     = (known after apply)
+       id             = (known after apply)
+       kms_key_name   = (known after apply)
+       md5hash        = (known after apply)
+       md5hexhash     = (known after apply)
+       media_link     = (known after apply)
+       name           = "data/warehouse/models/mart/gtfs/test_vehicle_locations.sql"
+       output_name    = (known after apply)
+       self_link      = (known after apply)
+       source         = "../../../../warehouse/models/mart/gtfs/test_vehicle_locations.sql"
+       storage_class  = (known after apply)
    }

  # google_storage_bucket_object.calitp-staging-composer-dags["models/mart/gtfs/test_vehicle_locations_path.sql"] will be created
+   resource "google_storage_bucket_object" "calitp-staging-composer-dags" {
+       bucket         = "calitp-staging-composer"
+       content        = (sensitive value)
+       content_type   = (known after apply)
+       crc32c         = (known after apply)
+       detect_md5hash = "different hash"
+       generation     = (known after apply)
+       id             = (known after apply)
+       kms_key_name   = (known after apply)
+       md5hash        = (known after apply)
+       md5hexhash     = (known after apply)
+       media_link     = (known after apply)
+       name           = "data/warehouse/models/mart/gtfs/test_vehicle_locations_path.sql"
+       output_name    = (known after apply)
+       self_link      = (known after apply)
+       source         = "../../../../warehouse/models/mart/gtfs/test_vehicle_locations_path.sql"
+       storage_class  = (known after apply)
    }

  # google_storage_bucket_object.calitp-staging-composer-manifest will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-staging-composer-manifest" {
!~      content             = (sensitive value)
!~      crc32c              = "F/xmNg==" -> (known after apply)
!~      detect_md5hash      = "ZSqx6z78ExlzwiEB+Rzk1w==" -> "different hash"
!~      generation          = 1755275092539714 -> (known after apply)
        id                  = "calitp-staging-composer-data/warehouse/target/manifest.json"
!~      md5hash             = "ZSqx6z78ExlzwiEB+Rzk1w==" -> (known after apply)
        name                = "data/warehouse/target/manifest.json"
#        (16 unchanged attributes hidden)
    }

Plan: 4 to add, 5 to change, 0 to destroy.

📝 Plan generated in Plan Airflow DAGs #492

github-actions bot commented Jul 24, 2025

Terraform plan in iac/cal-itp-data-infra/airflow/us

Plan: 4 to add, 5 to change, 0 to destroy.
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
+   create
!~  update in-place

Terraform will perform the following actions:

  # google_storage_bucket_object.calitp-composer-catalog will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-composer-catalog" {
!~      content             = (sensitive value)
!~      crc32c              = "prS3hQ==" -> (known after apply)
!~      detect_md5hash      = "Q+BPTUMl+rLSPAiZUOU8ug==" -> "different hash"
!~      generation          = 1755538684362691 -> (known after apply)
        id                  = "calitp-composer-data/warehouse/target/catalog.json"
!~      md5hash             = "Q+BPTUMl+rLSPAiZUOU8ug==" -> (known after apply)
        name                = "data/warehouse/target/catalog.json"
#        (16 unchanged attributes hidden)
    }

  # google_storage_bucket_object.calitp-composer-dags["models/intermediate/gtfs/_int_gtfs.yaml"] will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-composer-dags" {
!~      crc32c              = "1EOB9A==" -> (known after apply)
!~      detect_md5hash      = "qSASDPn6cn8Jdtmaoaz0FQ==" -> "different hash"
!~      generation          = 1754523290759543 -> (known after apply)
        id                  = "calitp-composer-data/warehouse/models/intermediate/gtfs/_int_gtfs.yaml"
!~      md5hash             = "qSASDPn6cn8Jdtmaoaz0FQ==" -> (known after apply)
        name                = "data/warehouse/models/intermediate/gtfs/_int_gtfs.yaml"
#        (17 unchanged attributes hidden)
    }

  # google_storage_bucket_object.calitp-composer-dags["models/intermediate/gtfs/int_gtfs_rt_vs_sched__stops_served_vehicle_positions.sql"] will be created
+   resource "google_storage_bucket_object" "calitp-composer-dags" {
+       bucket         = "calitp-composer"
+       content        = (sensitive value)
+       content_type   = (known after apply)
+       crc32c         = (known after apply)
+       detect_md5hash = "different hash"
+       generation     = (known after apply)
+       id             = (known after apply)
+       kms_key_name   = (known after apply)
+       md5hash        = (known after apply)
+       md5hexhash     = (known after apply)
+       media_link     = (known after apply)
+       name           = "data/warehouse/models/intermediate/gtfs/int_gtfs_rt_vs_sched__stops_served_vehicle_positions.sql"
+       output_name    = (known after apply)
+       self_link      = (known after apply)
+       source         = "../../../../warehouse/models/intermediate/gtfs/int_gtfs_rt_vs_sched__stops_served_vehicle_positions.sql"
+       storage_class  = (known after apply)
    }

  # google_storage_bucket_object.calitp-composer-dags["models/intermediate/gtfs/int_gtfs_schedule__stop_times_grouped.sql"] will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-composer-dags" {
!~      crc32c              = "gElm9g==" -> (known after apply)
!~      detect_md5hash      = "yDLEGvytnnWa5vUOCwQvEw==" -> "different hash"
!~      generation          = 1751416661143892 -> (known after apply)
        id                  = "calitp-composer-data/warehouse/models/intermediate/gtfs/int_gtfs_schedule__stop_times_grouped.sql"
!~      md5hash             = "yDLEGvytnnWa5vUOCwQvEw==" -> (known after apply)
        name                = "data/warehouse/models/intermediate/gtfs/int_gtfs_schedule__stop_times_grouped.sql"
#        (17 unchanged attributes hidden)
    }

  # google_storage_bucket_object.calitp-composer-dags["models/mart/gtfs/fct_vehicle_locations_path.sql"] will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-composer-dags" {
!~      crc32c              = "d+JBfA==" -> (known after apply)
!~      detect_md5hash      = "dvuSMtXVHhvZNA9RlS/FIw==" -> "different hash"
!~      generation          = 1752161967800132 -> (known after apply)
        id                  = "calitp-composer-data/warehouse/models/mart/gtfs/fct_vehicle_locations_path.sql"
!~      md5hash             = "dvuSMtXVHhvZNA9RlS/FIw==" -> (known after apply)
        name                = "data/warehouse/models/mart/gtfs/fct_vehicle_locations_path.sql"
#        (17 unchanged attributes hidden)
    }

  # google_storage_bucket_object.calitp-composer-dags["models/mart/gtfs/fct_vehicle_positions_stop_metrics.sql"] will be created
+   resource "google_storage_bucket_object" "calitp-composer-dags" {
+       bucket         = "calitp-composer"
+       content        = (sensitive value)
+       content_type   = (known after apply)
+       crc32c         = (known after apply)
+       detect_md5hash = "different hash"
+       generation     = (known after apply)
+       id             = (known after apply)
+       kms_key_name   = (known after apply)
+       md5hash        = (known after apply)
+       md5hexhash     = (known after apply)
+       media_link     = (known after apply)
+       name           = "data/warehouse/models/mart/gtfs/fct_vehicle_positions_stop_metrics.sql"
+       output_name    = (known after apply)
+       self_link      = (known after apply)
+       source         = "../../../../warehouse/models/mart/gtfs/fct_vehicle_positions_stop_metrics.sql"
+       storage_class  = (known after apply)
    }

  # google_storage_bucket_object.calitp-composer-dags["models/mart/gtfs/test_vehicle_locations.sql"] will be created
+   resource "google_storage_bucket_object" "calitp-composer-dags" {
+       bucket         = "calitp-composer"
+       content        = (sensitive value)
+       content_type   = (known after apply)
+       crc32c         = (known after apply)
+       detect_md5hash = "different hash"
+       generation     = (known after apply)
+       id             = (known after apply)
+       kms_key_name   = (known after apply)
+       md5hash        = (known after apply)
+       md5hexhash     = (known after apply)
+       media_link     = (known after apply)
+       name           = "data/warehouse/models/mart/gtfs/test_vehicle_locations.sql"
+       output_name    = (known after apply)
+       self_link      = (known after apply)
+       source         = "../../../../warehouse/models/mart/gtfs/test_vehicle_locations.sql"
+       storage_class  = (known after apply)
    }

  # google_storage_bucket_object.calitp-composer-dags["models/mart/gtfs/test_vehicle_locations_path.sql"] will be created
+   resource "google_storage_bucket_object" "calitp-composer-dags" {
+       bucket         = "calitp-composer"
+       content        = (sensitive value)
+       content_type   = (known after apply)
+       crc32c         = (known after apply)
+       detect_md5hash = "different hash"
+       generation     = (known after apply)
+       id             = (known after apply)
+       kms_key_name   = (known after apply)
+       md5hash        = (known after apply)
+       md5hexhash     = (known after apply)
+       media_link     = (known after apply)
+       name           = "data/warehouse/models/mart/gtfs/test_vehicle_locations_path.sql"
+       output_name    = (known after apply)
+       self_link      = (known after apply)
+       source         = "../../../../warehouse/models/mart/gtfs/test_vehicle_locations_path.sql"
+       storage_class  = (known after apply)
    }

  # google_storage_bucket_object.calitp-composer-manifest will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-composer-manifest" {
!~      content             = (sensitive value)
!~      crc32c              = "M2fl7Q==" -> (known after apply)
!~      detect_md5hash      = "t2qjY23MCKTGKyOj0YMyoQ==" -> "different hash"
!~      generation          = 1755538685950783 -> (known after apply)
        id                  = "calitp-composer-data/warehouse/target/manifest.json"
!~      md5hash             = "t2qjY23MCKTGKyOj0YMyoQ==" -> (known after apply)
        name                = "data/warehouse/target/manifest.json"
#        (16 unchanged attributes hidden)
    }

Plan: 4 to add, 5 to change, 0 to destroy.

📝 Plan generated in Plan Airflow DAGs #492

@tiffanychu90 tiffanychu90 marked this pull request as draft July 25, 2025 23:01
@tiffanychu90 tiffanychu90 force-pushed the expand-intermediate-stop-times branch from 75c5c38 to 98fb5d6 Compare July 28, 2025 16:48
@erikamov erikamov force-pushed the expand-intermediate-stop-times branch from 98fb5d6 to 5181804 Compare August 1, 2025 17:08
@tiffanychu90 tiffanychu90 force-pushed the expand-intermediate-stop-times branch from 5181804 to 03a83d9 Compare August 20, 2025 17:52

Development

Successfully merging this pull request may close these issues.

Proof of Concept: Using GTFS-RT Vehicle Positions to check for GTFS Schedule stop and shape accuracy