Commit

Merge pull request #3675 from cal-itp/add-labels-to-models
Add labels to DBT models
erikamov authored Feb 5, 2025
2 parents d6a16c9 + 30595dc commit 36b9127
Showing 2 changed files with 179 additions and 19 deletions.
150 changes: 131 additions & 19 deletions warehouse/dbt_project.yml
@@ -39,42 +39,154 @@ models:
staging:
+materialized: view
schema: staging
amplitude:
+labels:
domain: staging
dataset: benefits
audit:
+labels:
domain: staging
dataset: audit
gtfs:
+labels:
domain: staging
dataset: gtfs
gtfs_quality:
+labels:
domain: staging
dataset: gtfs_quality
ntd:
+labels:
domain: staging
dataset: ntd
ntd_annual_reporting:
+labels:
domain: staging
dataset: ntd_annual_reporting
ntd_ridership:
+labels:
domain: staging
dataset: ntd_ridership
ntd_safety_and_security:
+labels:
domain: staging
dataset: ntd_safety_and_security
ntd_validation:
+labels:
domain: staging
dataset: ntd_validation
payments:
+labels:
domain: staging
dataset: payments
rt:
+labels:
domain: staging
dataset: gtfs
state_geoportal:
+labels:
domain: staging
dataset: state_geoportal
transit_database:
+labels:
domain: staging
dataset: transit_database

intermediate:
gtfs:
+labels:
domain: intermediate
dataset: gtfs
gtfs_quality:
+labels:
domain: intermediate
dataset: gtfs_quality
guidelines_checks:
+materialized: table
ntd:
+labels:
domain: intermediate
dataset: ntd
ntd_validation:
+labels:
domain: intermediate
dataset: ntd_validation
payments:
+labels:
domain: intermediate
dataset: payments
transit_database:
+labels:
domain: intermediate
dataset: transit_database

mart:
transit_database:
schema: mart_transit_database
transit_database_latest:
schema: mart_transit_database_latest
gtfs_quality:
schema: mart_gtfs_quality
audit:
+labels:
domain: mart
dataset: audit
schema: mart_audit
benefits:
+labels:
domain: mart
dataset: benefits
schema: mart_benefits
gtfs:
+labels:
domain: mart
dataset: gtfs
schema: mart_gtfs
gtfs_quality:
+labels:
domain: mart
dataset: gtfs_quality
schema: mart_gtfs_quality
gtfs_schedule_latest:
+labels:
domain: mart
dataset: gtfs_schedule_latest
schema: mart_gtfs_schedule_latest
ad_hoc:
schema: mart_ad_hoc
audit:
schema: mart_audit
ntd:
+labels:
domain: mart
dataset: ntd
schema: mart_ntd
payments:
+materialized: table
schema: mart_payments
benefits:
schema: mart_benefits
ntd_validation:
schema: mart_ntd_validation
ntd_annual_reporting:
+materialized: table
+labels:
domain: mart
dataset: ntd_annual_reporting
schema: mart_ntd_annual_reporting
ntd_ridership:
+materialized: table
+labels:
domain: mart
dataset: ntd_ridership
schema: mart_ntd_ridership
ntd_safety_and_security:
+materialized: table
+labels:
domain: mart
dataset: ntd_safety_and_security
schema: mart_ntd_safety_and_security
ntd_ridership:
ntd_validation:
+labels:
domain: mart
dataset: ntd_validation
schema: mart_ntd_validation
payments:
+materialized: table
schema: mart_ntd_ridership
+labels:
domain: mart
dataset: payments
schema: mart_payments
transit_database:
+labels:
domain: mart
dataset: transit_database
schema: mart_transit_database
transit_database_latest:
+labels:
domain: mart
dataset: transit_database_latest
schema: mart_transit_database_latest
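The diff above is shown without indentation, so the nesting of the new `+labels` blocks is not visible. A minimal sketch of how one staging entry and one mart entry would nest, assuming standard dbt folder-level config layout and an adapter that supports the `labels` config (for example dbt-bigquery). The project key `example_project` is hypothetical; the real project name is not shown in this diff.

```yaml
models:
  example_project:        # hypothetical project name, not shown in the diff
    staging:
      +materialized: view
      schema: staging
      gtfs:
        +labels:          # applied to every model under staging/gtfs
          domain: staging
          dataset: gtfs
    mart:
      gtfs:
        +labels:
          domain: mart
          dataset: gtfs
        schema: mart_gtfs
```

In this layout `domain` records the layer (staging, intermediate, mart) and `dataset` records the subject area, so both can be filtered on independently.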
48 changes: 48 additions & 0 deletions warehouse/seeds/_seeds.yml
@@ -4,6 +4,10 @@ seeds:
- name: miles_traveled
description: |
A matrix of the distance between origin/destination pairs in miles.
config:
labels:
domain: seeds
dataset: payments
columns:
- name: location_name
description: On location in the O/D pair
@@ -19,6 +23,10 @@ seeds:
- name: ntd_modes_to_full_names
description: |
A list of ntd 2 letter mode codes and their full names
config:
labels:
domain: seeds
dataset: ntd
columns:
- name: ntd_mode_abbreviation
description: The two letter abbreviation mode
@@ -28,6 +36,10 @@ seeds:
description: The mode's full name

- name: payments_entity_mapping
config:
labels:
domain: seeds
dataset: payments
columns:
- name: gtfs_dataset_source_record_id
description: Unversioned key to dim_gtfs_datasets natural key from Airtable.
@@ -45,6 +57,10 @@ seeds:
description: |
A list of validation codes output by the GTFS RT validator, and their severities.
Originally sourced from https://docs.google.com/spreadsheets/d/1GDDaDlsBPCYn3dtYPSABnce9ns3ekJ8Jzfgyy56lZz4/edit#gid=617612870.
config:
labels:
domain: seeds
dataset: gtfs_quality
columns:
- name: code
tests:
@@ -56,6 +72,10 @@ seeds:
description: |
A list of validation codes output by the GTFS Schedule validator, and their severities.
Originally sourced from https://docs.google.com/spreadsheets/d/1GDDaDlsBPCYn3dtYPSABnce9ns3ekJ8Jzfgyy56lZz4/edit#gid=0.
config:
labels:
domain: seeds
dataset: gtfs_quality
columns:
- name: name
tests:
@@ -67,6 +87,10 @@ seeds:
A list of validation codes output by the GTFS Schedule validator, and their severities and descriptions.
This data was manually parsed from the contents of the RULES.md file in the v2.0.0 release of the validator,
sourced from: https://github.com/MobilityData/gtfs-validator/archive/refs/tags/v2.0.0.zip
config:
labels:
domain: seeds
dataset: gtfs_quality
columns:
- name: code
tests:
@@ -87,6 +111,10 @@ seeds:
A list of validation codes output by the GTFS Schedule validator, and their severities and descriptions.
This data was manually parsed from the contents of the RULES.md file in the v3.1.1 release of the validator,
sourced from: https://github.com/MobilityData/gtfs-validator/archive/refs/tags/v3.1.1.zip
config:
labels:
domain: seeds
dataset: gtfs_quality
columns:
- name: code
tests:
@@ -107,6 +135,10 @@ seeds:
A list of validation codes output by the GTFS Schedule validator, and their severities and descriptions.
This data was manually parsed from the contents of the RULES.md file in the v4.0.0 release of the validator,
sourced from: https://github.com/MobilityData/gtfs-validator/archive/refs/tags/v4.0.0.zip
config:
labels:
domain: seeds
dataset: gtfs_quality
columns:
- name: code
tests:
@@ -127,6 +159,10 @@ seeds:
A list of validation codes output by the GTFS Schedule validator, and their severities and descriptions.
This data was manually parsed from the contents of the RULES.md file in the v4.1.0 release of the validator,
sourced from: https://github.com/MobilityData/gtfs-validator/archive/refs/tags/v4.1.0.zip
config:
labels:
domain: seeds
dataset: gtfs_quality
columns:
- name: code
tests:
@@ -147,6 +183,10 @@ seeds:
A list of validation codes output by the GTFS Schedule validator, and their severities and descriptions.
This data was manually parsed from the contents of the RULES.md file in the v4.2.0 release of the validator,
sourced from: https://github.com/MobilityData/gtfs-validator/archive/refs/tags/v4.2.0.zip
config:
labels:
domain: seeds
dataset: gtfs_quality
columns:
- name: code
tests:
@@ -167,6 +207,10 @@ seeds:
A list of validation codes output by the GTFS Schedule validator, and their severities and descriptions.
This data was manually parsed from the contents of the RULES.md file in the v5.0.0 release of the validator,
sourced from: https://github.com/MobilityData/gtfs-validator/releases/tag/v5.0.0
config:
labels:
domain: seeds
dataset: gtfs_quality
columns:
- name: code
tests:
@@ -200,6 +244,10 @@ seeds:
There are also a few records in here that were manually added after looking at the 2021 NTD
data and comparing that to see if any records were missing.
config:
labels:
domain: seeds
dataset: transit_database
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
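For seeds, the labels are set per seed under `config:` in the properties file rather than by folder in `dbt_project.yml`. A minimal sketch of the first seed entry with its indentation restored, assuming the usual dbt properties-file layout. All values are taken from the first hunk above.

```yaml
seeds:
  - name: miles_traveled
    description: |
      A matrix of the distance between origin/destination pairs in miles.
    config:
      labels:             # no "+" prefix inside a resource-level config block
        domain: seeds
        dataset: payments
    columns:
      - name: location_name
        description: On location in the O/D pair
```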
