Commit dc75f81

Updated metadata and schema templates
1 parent 1fac5f4 commit dc75f81

8 files changed: +50 -40 lines changed

CODEOWNERS

Lines changed: 1 addition & 3 deletions
@@ -26,6 +26,4 @@ dags.yaml
 /sql/moz-fx-data-shared-prod/contextual_services_derived/request_payload_tiles_v2 @mozilla/request_payload_reviewers
 /sql/moz-fx-data-shared-prod/contextual_services_derived/suggest_revenue_levers_daily_v1 @mozilla/revenue_forecasting_data_reviewers
 /sql/moz-fx-data-shared-prod/monitoring_derived/jobs_by_organization_v1 @mozilla/dataops
-# City Seen
-/sql_generators/clients_city_seen/templates/ @wichan @soGaussian
-/sql/**/clients_city_seen_v1 @wichan @soGaussian
+

dags.yaml

Lines changed: 1 addition & 1 deletion
@@ -2456,7 +2456,7 @@ bqetl_bigeye_derived:
   - impact/tier_3
 
 bqetl_clients_city_seen:
-  schedule_interval: 0 2 * * *
+  schedule_interval: 0 4 * * *
   default_args:
 
     start_date: "2025-08-25"

sql/moz-fx-data-shared-prod/fenix_derived/clients_city_seen_v1/metadata.yaml

Lines changed: 8 additions & 4 deletions
@@ -1,19 +1,24 @@
 friendly_name: Fenix Clients City Seen
 description: |-
-  This table captures the first and last seen geo attributes for each client_id and normalized channel.
+  This table stores the first-seen and last-seen geo attributes for each (client_id, normalized_channel).
+  The table was initialized from stable tables (with ~2 years of retention), so the initial dates reflect the earliest/latest
+  observations within that window. It then updates daily using live tables, replacing last-seen fields for existing
+  clients and adding new clients with first-seen geo attributes that do not yet exist in the table.
+  Implementation Plan: https://docs.google.com/document/d/1S8yVEwJjtJy3Pd8cn_BHRhgxllVnoxKylGCpuMyqWNQ/edit?usp=sharing
 owners:
 
 labels:
-  change_controlled: true
   incremental: true
   schedule: daily
   table_type: client_level
   dag: bqetl_clients_city_seen
 scheduling:
   dag_name: bqetl_clients_city_seen
-  task_group: fenix_clients_city_seen
+  task_name: fenix_clients_city_seen_v1
   depends_on_past: true
   date_partition_parameter: null
+  parameters:
+  - submission_date:DATE:
   depends_on:
   - task_id: copy_deduplicate_all
     dag_name: copy_deduplicate
@@ -24,7 +29,6 @@ bigquery:
     field: last_seen_geo_date
     require_partition_filter: false
     expiration_days: null
-  range_partitioning: null
   clustering:
     fields:
     - sample_id
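
The update rule stated in the new description (replace last-seen fields for existing clients, add clients not yet in the table) can be illustrated with a small Python sketch. This is not the table's actual query, which is not part of this commit. `CityRecord` and `daily_update` are hypothetical names, only the city field is modelled, and it is an assumption that a newly added client gets identical first-seen and last-seen values on its first day.

```python
from dataclasses import dataclass, replace
from datetime import date

# Key mirrors the (client_id, normalized_channel) pair named in the description.
Key = tuple[str, str]


@dataclass(frozen=True)
class CityRecord:
    first_seen_geo_date: date
    first_seen_geo_city: str
    last_seen_geo_date: date
    last_seen_geo_city: str


def daily_update(
    table: dict[Key, CityRecord],
    observed: dict[Key, str],
    submission_date: date,
) -> dict[Key, CityRecord]:
    """Return a new table state after applying one day of observed cities."""
    updated = dict(table)
    for key, city in observed.items():
        existing = updated.get(key)
        if existing is None:
            # New client: first-seen and last-seen start out equal (assumption).
            updated[key] = CityRecord(submission_date, city, submission_date, city)
        else:
            # Existing client: first-seen fields are kept, last-seen are replaced.
            updated[key] = replace(
                existing,
                last_seen_geo_date=submission_date,
                last_seen_geo_city=city,
            )
    return updated
```

Because each run builds on the previous state, a rule of this shape is consistent with `depends_on_past: true` in the scheduling block.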

sql/moz-fx-data-shared-prod/fenix_derived/clients_city_seen_v1/schema.yaml

Lines changed: 8 additions & 8 deletions
@@ -10,40 +10,40 @@ fields:
 - name: sample_id
   type: INTEGER
   mode: NULLABLE
-  description: sample_id derived from client_id
+  description: A number, 0-99, that samples by client_id and allows filtering data for analysis.
 - name: normalized_channel
   type: STRING
   mode: NULLABLE
   description: The normalized channel the application is being distributed on.
 - name: first_seen_geo_date
   type: DATE
   mode: NULLABLE
-  description: Date when the first seen geo data were captured.
+  description: Date when the first seen geo fields were captured.
 - name: first_seen_geo_city
   type: STRING
   mode: NULLABLE
-  description: First city captured from a client captured on first_seen_geo_date.
+  description: City captured on first_seen_geo_date.
 - name: first_seen_geo_subdivision1
   type: STRING
   mode: NULLABLE
-  description: First major country subdivision, typically a state, province, or county captured on first_seen_geo_date.
+  description: Major country subdivision, typically a state, province, or county captured on first_seen_geo_date.
 - name: first_seen_geo_subdivision2
   type: STRING
   mode: NULLABLE
   description: Second major country subdivision; not applicable for most countries captured on first_seen_geo_date.
 - name: last_seen_geo_date
   type: DATE
   mode: NULLABLE
-  description: Date when the last seen geo data were captured.
+  description: Date when the last seen geo fields were captured.
 - name: last_seen_geo_city
   type: STRING
   mode: NULLABLE
-  description: Last city captured from a client on last_seen_geo_city.
+  description: City captured on last_seen_geo_city.
 - name: last_seen_geo_subdivision1
   type: STRING
   mode: NULLABLE
-  description: First major country subdivision, typically a state, province, or county captured on last_seen_geo_city.
+  description: Major country subdivision, typically a state, province, or county captured on last_seen_geo_date.
 - name: last_seen_geo_subdivision2
   type: STRING
   mode: NULLABLE
-  description: Second major country subdivision; not applicable for most countries captured on last_seen_geo_city.
+  description: Second major country subdivision; not applicable for most countries captured on last_seen_geo_date.
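
The new `sample_id` description (a number from 0 to 99 that samples by client_id) can be illustrated with a short Python sketch. The hash used here is an assumption: the commit does not say how `sample_id` is computed, and CRC32 stands in only as a stable, deterministic hash. `sample_id` and `sample_clients` are hypothetical helper names.

```python
import zlib


def sample_id(client_id: str) -> int:
    """Map a client_id to a stable bucket in the range 0-99.

    Assumption: CRC32 is an illustrative choice of hash, not necessarily
    the one used to populate the real column.
    """
    return zlib.crc32(client_id.encode("utf-8")) % 100


def sample_clients(client_ids: list[str], bucket: int = 0) -> list[str]:
    """Keep only clients in one bucket, roughly 1% of a large population."""
    return [cid for cid in client_ids if sample_id(cid) == bucket]
```

Since the bucket depends only on client_id, filtering on a single `sample_id` value keeps or drops all rows of a given client together, which is what makes the column useful for sampling in analysis.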

sql/moz-fx-data-shared-prod/firefox_desktop_derived/clients_city_seen_v1/metadata.yaml

Lines changed: 8 additions & 4 deletions
@@ -1,19 +1,24 @@
 friendly_name: Firefox Desktop Clients City Seen
 description: |-
-  This table captures the first and last seen geo attributes for each client_id and normalized channel.
+  This table stores the first-seen and last-seen geo attributes for each (client_id, normalized_channel).
+  The table was initialized from stable tables (with ~2 years of retention), so the initial dates reflect the earliest/latest
+  observations within that window. It then updates daily using live tables, replacing last-seen fields for existing
+  clients and adding new clients with first-seen geo attributes that do not yet exist in the table.
+  Implementation Plan: https://docs.google.com/document/d/1S8yVEwJjtJy3Pd8cn_BHRhgxllVnoxKylGCpuMyqWNQ/edit?usp=sharing
 owners:
 
 labels:
-  change_controlled: true
   incremental: true
   schedule: daily
   table_type: client_level
   dag: bqetl_clients_city_seen
 scheduling:
   dag_name: bqetl_clients_city_seen
-  task_group: firefox_desktop_clients_city_seen
+  task_name: firefox_desktop_clients_city_seen_v1
   depends_on_past: true
   date_partition_parameter: null
+  parameters:
+  - submission_date:DATE:
   depends_on:
   - task_id: copy_deduplicate_all
     dag_name: copy_deduplicate
@@ -24,7 +29,6 @@ bigquery:
     field: last_seen_geo_date
     require_partition_filter: false
     expiration_days: null
-  range_partitioning: null
   clustering:
     fields:
     - sample_id

sql/moz-fx-data-shared-prod/firefox_desktop_derived/clients_city_seen_v1/schema.yaml

Lines changed: 8 additions & 8 deletions
@@ -10,40 +10,40 @@ fields:
 - name: sample_id
   type: INTEGER
   mode: NULLABLE
-  description: sample_id derived from client_id
+  description: A number, 0-99, that samples by client_id and allows filtering data for analysis.
 - name: normalized_channel
   type: STRING
   mode: NULLABLE
   description: The normalized channel the application is being distributed on.
 - name: first_seen_geo_date
   type: DATE
   mode: NULLABLE
-  description: Date when the first seen geo data were captured.
+  description: Date when the first seen geo fields were captured.
 - name: first_seen_geo_city
   type: STRING
   mode: NULLABLE
-  description: First city captured from a client captured on first_seen_geo_date.
+  description: City captured on first_seen_geo_date.
 - name: first_seen_geo_subdivision1
   type: STRING
   mode: NULLABLE
-  description: First major country subdivision, typically a state, province, or county captured on first_seen_geo_date.
+  description: Major country subdivision, typically a state, province, or county captured on first_seen_geo_date.
 - name: first_seen_geo_subdivision2
   type: STRING
   mode: NULLABLE
   description: Second major country subdivision; not applicable for most countries captured on first_seen_geo_date.
 - name: last_seen_geo_date
   type: DATE
   mode: NULLABLE
-  description: Date when the last seen geo data were captured.
+  description: Date when the last seen geo fields were captured.
 - name: last_seen_geo_city
   type: STRING
   mode: NULLABLE
-  description: Last city captured from a client on last_seen_geo_city.
+  description: City captured on last_seen_geo_city.
 - name: last_seen_geo_subdivision1
   type: STRING
   mode: NULLABLE
-  description: First major country subdivision, typically a state, province, or county captured on last_seen_geo_city.
+  description: Major country subdivision, typically a state, province, or county captured on last_seen_geo_date.
 - name: last_seen_geo_subdivision2
   type: STRING
   mode: NULLABLE
-  description: Second major country subdivision; not applicable for most countries captured on last_seen_geo_city.
+  description: Second major country subdivision; not applicable for most countries captured on last_seen_geo_date.

sql_generators/clients_city_seen_v1/templates/metadata.yaml

Lines changed: 8 additions & 4 deletions
@@ -1,19 +1,24 @@
 friendly_name: {{ app_value }} Clients City Seen
 description: |-
-  This table captures the first and last seen geo attributes for each client_id and normalized channel.
+  This table stores the first-seen and last-seen geo attributes for each (client_id, normalized_channel).
+  The table was initialized from stable tables (with ~2 years of retention), so the initial dates reflect the earliest/latest
+  observations within that window. It then updates daily using live tables, replacing last-seen fields for existing
+  clients and adding new clients with first-seen geo attributes that do not yet exist in the table.
+  Implementation Plan: https://docs.google.com/document/d/1S8yVEwJjtJy3Pd8cn_BHRhgxllVnoxKylGCpuMyqWNQ/edit?usp=sharing
 owners:
 
 labels:
-  change_controlled: true
   incremental: true
   schedule: daily
   table_type: client_level
   dag: bqetl_clients_city_seen
 scheduling:
   dag_name: bqetl_clients_city_seen
-  task_group: {{ app_name }}_clients_city_seen
+  task_name: {{ app_name }}_{{ table_name }}
   depends_on_past: true
   date_partition_parameter: null
+  parameters:
+  - submission_date:DATE:{{ds}}
   depends_on:
   - task_id: copy_deduplicate_all
     dag_name: copy_deduplicate
@@ -24,7 +29,6 @@ bigquery:
     field: last_seen_geo_date
     require_partition_filter: false
     expiration_days: null
-  range_partitioning: null
   clustering:
     fields:
     - sample_id
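
How the template placeholders above relate to the generated files can be sketched with a minimal stand-in renderer in Python. This is not the generator's code and not a real template engine: `render` is a hypothetical helper that only handles simple double-brace name placeholders, and the values `fenix` and `clients_city_seen_v1` are inferred from the generated Fenix `task_name`, not taken from the generator itself.

```python
import re

# Matches simple double-brace placeholders such as "{{ app_name }}" or "{{ds}}".
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, **context: str) -> str:
    """Substitute placeholders from context; unknown names become ''.

    Rendering unknown names as an empty string is an assumption about how
    the generator is configured.
    """
    return PLACEHOLDER.sub(lambda m: context.get(m.group(1), ""), template)


task_name = render(
    "{{ app_name }}_{{ table_name }}",
    app_name="fenix",
    table_name="clients_city_seen_v1",
)

# No value is supplied for "ds" here, so the placeholder renders as empty.
parameter = render("submission_date:DATE:{{ds}}")
```

Under these assumptions `task_name` comes out as `fenix_clients_city_seen_v1`, matching the generated Fenix metadata, and `parameter` comes out as `submission_date:DATE:` with an empty value, which is also what the generated metadata files in this commit contain.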

sql_generators/clients_city_seen_v1/templates/schema.yaml

Lines changed: 8 additions & 8 deletions
@@ -10,40 +10,40 @@ fields:
 - name: sample_id
   type: INTEGER
   mode: NULLABLE
-  description: sample_id derived from client_id
+  description: A number, 0-99, that samples by client_id and allows filtering data for analysis.
 - name: normalized_channel
   type: STRING
   mode: NULLABLE
   description: The normalized channel the application is being distributed on.
 - name: first_seen_geo_date
   type: DATE
   mode: NULLABLE
-  description: Date when the first seen geo data were captured.
+  description: Date when the first seen geo fields were captured.
 - name: first_seen_geo_city
   type: STRING
   mode: NULLABLE
-  description: First city captured from a client captured on first_seen_geo_date.
+  description: City captured on first_seen_geo_date.
 - name: first_seen_geo_subdivision1
   type: STRING
   mode: NULLABLE
-  description: First major country subdivision, typically a state, province, or county captured on first_seen_geo_date.
+  description: Major country subdivision, typically a state, province, or county captured on first_seen_geo_date.
 - name: first_seen_geo_subdivision2
   type: STRING
   mode: NULLABLE
   description: Second major country subdivision; not applicable for most countries captured on first_seen_geo_date.
 - name: last_seen_geo_date
   type: DATE
   mode: NULLABLE
-  description: Date when the last seen geo data were captured.
+  description: Date when the last seen geo fields were captured.
 - name: last_seen_geo_city
   type: STRING
   mode: NULLABLE
-  description: Last city captured from a client on last_seen_geo_city.
+  description: City captured on last_seen_geo_city.
 - name: last_seen_geo_subdivision1
   type: STRING
   mode: NULLABLE
-  description: First major country subdivision, typically a state, province, or county captured on last_seen_geo_city.
+  description: Major country subdivision, typically a state, province, or county captured on last_seen_geo_date.
 - name: last_seen_geo_subdivision2
   type: STRING
   mode: NULLABLE
-  description: Second major country subdivision; not applicable for most countries captured on last_seen_geo_city.
+  description: Second major country subdivision; not applicable for most countries captured on last_seen_geo_date.
