Skip to content

dbt-labs/snowflake_sko_hol_2026

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

25 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

dbt Industries Views

A multi-industry dbt project designed for flexible, industry-specific analytics with Fivetran integration. This project supports multiple industry verticals with independent deployment and selective execution capabilities.

🎯 Project Overview

This dbt project creates analytical views from industry-specific datasets synced by Fivetran. The modular structure allows you to:

  • Add multiple industries to the same project (Healthcare, Agriculture, Meteorology, etc.)
  • Deploy selectively by industry using tags or folder paths
  • Share common infrastructure while maintaining industry-specific logic
  • Scale easily by adding new industry folders as needed

Current Industries

Healthcare (CDS)

Source: PostgreSQL industries database β†’ CDS_RECORDS table
Schema: INDUSTRIES_HEALTHCARE
Views: 5 analytical views covering patient outcomes, treatment effectiveness, clinical trials, data quality, and high-risk alerts

Future industries (AGR, MET, etc.) will follow the same pattern.


πŸ“ Project Structure

dbt_industries_views/
β”œβ”€β”€ dbt_project.yml              # Project configuration with industry-specific settings
β”œβ”€β”€ profiles.yml                 # Snowflake connection template
β”œβ”€β”€ packages.yml                 # dbt dependencies
β”œβ”€β”€ .gitignore                   # Git exclusions
β”œβ”€β”€ README.md                    # This file
β”œβ”€β”€ models/
β”‚   β”œβ”€β”€ cds/                     # Healthcare industry models
β”‚   β”‚   β”œβ”€β”€ _cds__sources.yml    # Source definitions
β”‚   β”‚   β”œβ”€β”€ schema.yml           # Model documentation
β”‚   β”‚   β”œβ”€β”€ vw_cds_patient_outcomes_kpi.sql
β”‚   β”‚   β”œβ”€β”€ vw_cds_treatment_effectiveness.sql
β”‚   β”‚   β”œβ”€β”€ vw_cds_clinical_trial_performance.sql
β”‚   β”‚   β”œβ”€β”€ vw_cds_data_quality.sql
β”‚   β”‚   └── vw_cds_high_risk_patients.sql
β”‚   └── [future industries]/     # AGR, MET, etc.
└── macros/
    └── generate_schema_name.sql # Custom schema routing

πŸš€ Quick Start

Prerequisites

  • dbt Core 1.5+ installed (installation guide)
  • Snowflake account with appropriate permissions
  • Fivetran connector syncing data to Snowflake
  • Git for version control

1. Clone and Setup

# Clone the repository (or download the project files)
git clone <your-repo-url>
cd dbt_industries_views

# Install dbt dependencies
dbt deps

2. Configure Connection

Copy the profiles.yml template to ~/.dbt/profiles.yml and update with your Snowflake credentials:

industries_snowflake:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: YOUR_SNOWFLAKE_ACCOUNT
      user: YOUR_USERNAME
      password: YOUR_PASSWORD
      role: YOUR_ROLE
      database: HOL_DATABASE
      warehouse: YOUR_WAREHOUSE
      schema: INDUSTRIES_HEALTHCARE
      threads: 4

3. Test Connection

# Verify connection
dbt debug

# Compile models to check for errors
dbt compile

4. Run Models

# Run all models
dbt run

# Run only CDS healthcare models
dbt run --select cds.*

# Run models with specific tag
dbt run --select tag:healthcare

πŸ“Š CDS Healthcare Views

1. vw_cds_patient_outcomes_kpi

Purpose: Executive KPI dashboard
Metrics: Patient volume, outcome scores, treatment success, costs, satisfaction, risk indicators
Audience: Healthcare executives, clinical leadership
Query Example:

SELECT * FROM INDUSTRIES_HEALTHCARE.VW_CDS_PATIENT_OUTCOMES_KPI;

2. vw_cds_treatment_effectiveness

Purpose: Treatment outcome analysis by diagnosis and plan
Dimensions: Diagnosis, Treatment Plan
Metrics: Success rates, cost per outcome, adherence impact
Audience: Clinical teams, treatment coordinators
Query Example:

SELECT 
  DIAGNOSIS,
  TREATMENT_PLAN,
  success_rate,
  cost_per_successful_outcome
FROM INDUSTRIES_HEALTHCARE.VW_CDS_TREATMENT_EFFECTIVENESS
ORDER BY success_rate DESC;

3. vw_cds_clinical_trial_performance

Purpose: Clinical trial monitoring and effectiveness
Dimensions: Trial Name, Trial Status
Metrics: Enrollment, outcomes, safety, costs, publications
Audience: Research teams, trial coordinators
Query Example:

SELECT 
  TRIAL_NAME,
  TRIAL_STATUS,
  enrolled_patients,
  success_rate,
  efficiency_score
FROM INDUSTRIES_HEALTHCARE.VW_CDS_CLINICAL_TRIAL_PERFORMANCE
WHERE TRIAL_STATUS = 'Active';

4. vw_cds_data_quality

Purpose: Data completeness and integrity monitoring
Checks: Null percentages, duplicates, validity, freshness
Audience: Data engineers, quality assurance
Query Example:

SELECT 
  overall_quality_score,
  quality_rating,
  completeness_score,
  validity_score,
  freshness_score
FROM INDUSTRIES_HEALTHCARE.VW_CDS_DATA_QUALITY;

5. vw_cds_high_risk_patients

Purpose: Patient alerts requiring immediate attention
Filters: High readmission risk, critical vitals, poor outcomes
Audience: Clinical teams, care coordinators
Query Example:

SELECT 
  PATIENT_ID,
  priority_level,
  risk_score,
  alert_message,
  recommended_actions
FROM INDUSTRIES_HEALTHCARE.VW_CDS_HIGH_RISK_PATIENTS
WHERE priority_level IN ('Critical', 'High')
ORDER BY risk_score DESC;

πŸ”„ Fivetran dbt Core Integration

Setup Instructions

  1. Push to GitHub:

    git init
    git add .
    git commit -m "Initial dbt industries project"
    git remote add origin <your-github-repo>
    git push -u origin main
  2. Configure in Fivetran:

    • Navigate to Transformations β†’ dbt Core
    • Connect your GitHub repository
    • Set Target: prod
    • Configure credentials in Fivetran UI (not in profiles.yml)
  3. Selective Execution: Configure dbt command based on your needs:

    Run all CDS models:

    dbt run --select cds.*

    Run only KPI views:

    dbt run --select tag:kpi

    Run specific model:

    dbt run --select vw_cds_patient_outcomes_kpi
  4. Schedule:

    • Fivetran runs dbt automatically after connector syncs
    • No additional scheduling needed

πŸ—οΈ Adding New Industries

To add a new industry (e.g., Agriculture):

1. Create Industry Folder

mkdir -p models/agr

2. Update dbt_project.yml

models:
  dbt_industries_views:
    agr:
      +tags: ['agr', 'agriculture']
      +schema: industries_agriculture
      +materialized: view

vars:
  agr_source_database: 'HOL_DATABASE'
  agr_source_schema: 'INDUSTRIES_AGRICULTURE'
  agr_source_table: 'AGR_RECORDS'

3. Create Source Definition

Create models/agr/_agr__sources.yml:

version: 2

sources:
  - name: industries_agriculture
    database: "{{ var('agr_source_database') }}"
    schema: "{{ var('agr_source_schema') }}"
    tables:
      - name: agr_records
        identifier: "{{ var('agr_source_table') }}"

4. Create Views

Add SQL files in models/agr/:

  • vw_agr_crop_yields_kpi.sql
  • vw_agr_weather_impact.sql
  • etc.

5. Document Models

Create models/agr/schema.yml with model documentation.

6. Deploy

# Test locally
dbt run --select agr.*

# Push to GitHub
git add models/agr/
git commit -m "Add agriculture industry models"
git push

# Configure in Fivetran with: dbt run --select agr.*

🎨 Customization

Modify Source Tables

Update variables in dbt_project.yml:

vars:
  cds_source_database: 'YOUR_DATABASE'
  cds_source_schema: 'YOUR_SCHEMA'
  cds_source_table: 'YOUR_TABLE'

Change Materialization

Update model config in SQL file:

{{
  config(
    materialized='table',  -- Change from 'view' to 'table'
    tags=['cds', 'healthcare', 'kpi']
  )
}}

Add Custom Tags

Add tags for selective execution:

{{
  config(
    tags=['cds', 'healthcare', 'executive', 'weekly_report']
  )
}}

Then run: dbt run --select tag:weekly_report


πŸ§ͺ Testing

# Compile all models
dbt compile

# Run all models
dbt run

# Test source data
dbt test --select source:*

# Test specific model
dbt test --select vw_cds_patient_outcomes_kpi

# Generate documentation
dbt docs generate
dbt docs serve  # Opens in browser

πŸ“‹ Development Workflow

Local Development

# Create feature branch
git checkout -b feature/new-industry-views

# Develop and test
dbt run --select <your_models>
dbt test

# Commit changes
git add .
git commit -m "Add new industry views"
git push origin feature/new-industry-views

Deployment

  1. Create Pull Request in GitHub
  2. Review and merge to main branch
  3. Fivetran automatically deploys on next connector sync

πŸ” Monitoring

Data Quality

Check data quality regularly:

SELECT * FROM INDUSTRIES_HEALTHCARE.VW_CDS_DATA_QUALITY;

Fivetran Logs

Monitor in Fivetran UI:

  • Transformations β†’ Logs for dbt run history
  • Connectors β†’ Logs for sync status

Alert Integration

Query high-risk patients for alerting:

SELECT 
  PATIENT_ID,
  priority_level,
  alert_message
FROM INDUSTRIES_HEALTHCARE.VW_CDS_HIGH_RISK_PATIENTS
WHERE priority_level = 'Critical';

πŸ› οΈ Troubleshooting

Common Issues

"Column not found" errors:

  • Verify column names match exactly (case-sensitive in Snowflake)
  • Check source table with: SHOW COLUMNS IN TABLE <table>

dbt compilation fails:

  • Check YAML syntax (use 2 spaces, no tabs)
  • Validate with: dbt compile

Views empty after creation:

  • Verify source table has data
  • Check WHERE clause filters aren't too restrictive

Fivetran transformation fails:

  • Review Fivetran logs for specific error
  • Test locally first: dbt run --select <model>
  • Verify credentials are configured in Fivetran

πŸ“š Resources


🀝 Contributing

To contribute new industries or views:

  1. Follow the structure in models/cds/ as a template
  2. Document all models in schema.yml
  3. Add tests for critical fields
  4. Update this README with new industry documentation
  5. Submit Pull Request with clear description

πŸ“„ License

[Add your license information here]


✨ Project Metadata

  • Project Name: dbt_industries_views
  • Version: 1.0.0
  • dbt Version: 1.5+
  • Warehouse: Snowflake
  • Connector: Fivetran
  • Industries: Healthcare (CDS) - More to come

Questions or Issues? [Create an issue in GitHub]

About

A demo dbt Project for a hands-on workshop at Snowflake SKO 2026

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages