Commit 7fd18ae
Merge pull request #218 from databrickslabs/feature/v0.0.10
Feature/v0.0.10
2 parents 15abb7c + d87900f commit 7fd18ae

28 files changed: +730 −225 lines changed

README.md

Lines changed: 23 additions & 23 deletions
@@ -29,13 +29,13 @@ In practice, a single generic pipeline reads the Dataflowspec and uses it to orc
 - Capture [Data Quality Rules](https://github.com/databrickslabs/dlt-meta/tree/main/examples/dqe/customers/bronze_data_quality_expectations.json)
 - Capture processing logic as sql in [Silver transformation file](https://github.com/databrickslabs/dlt-meta/blob/main/examples/silver_transformations.json)

-#### Generic DLT pipeline
+#### Generic Lakeflow Declarative Pipeline

 - Apply appropriate readers based on input metadata
 - Apply data quality rules with DLT expectations
 - Apply CDC apply changes if specified in metadata
-- Builds DLT graph based on input/output metadata
-- Launch DLT pipeline
+- Builds Lakeflow Declarative Pipeline graph based on input/output metadata
+- Launch Lakeflow Declarative Pipeline

 ## High-Level Process Flow:
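For orientation, here is a minimal sketch of the pipeline pattern those bullets describe, using the `dlt` Python API inside a pipeline notebook (where `spark` is predefined). The table names, source path, and keys are illustrative assumptions, not dlt-meta's actual generated code:

```python
import dlt
from pyspark.sql.functions import col

# Bronze: read raw files with Auto Loader and enforce data quality expectations
@dlt.table(name="bronze_customers")
@dlt.expect_all_or_drop({"valid_id": "customer_id IS NOT NULL"})
def bronze_customers():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/demo/raw/customers")  # illustrative path
    )

# Silver: merge CDC records from bronze into a streaming target table
dlt.create_streaming_table("silver_customers")
dlt.apply_changes(  # newer runtimes name this create_auto_cdc_flow
    target="silver_customers",
    source="bronze_customers",
    keys=["customer_id"],
    sequence_by=col("event_ts"),  # illustrative ordering column
)
```

dlt-meta generates the equivalent of this wiring from the Dataflowspec metadata rather than hand-written code.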

@@ -53,14 +53,15 @@ In practice, a single generic pipeline reads the Dataflowspec and uses it to orc
 | Custom transformations | Bronze, Silver layer accepts custom functions|
 | Data Quality Expectations Support | Bronze, Silver layer |
 | Quarantine table support | Bronze layer |
-| [apply_changes](https://docs.databricks.com/en/delta-live-tables/python-ref.html#cdc) API support | Bronze, Silver layer |
-| [apply_changes_from_snapshot](https://docs.databricks.com/en/delta-live-tables/python-ref.html#change-data-capture-from-database-snapshots-with-python-in-delta-live-tables) API support | Bronze layer|
+| [create_auto_cdc_flow](https://docs.databricks.com/aws/en/dlt-ref/dlt-python-ref-apply-changes) API support | Bronze, Silver layer |
+| [create_auto_cdc_from_snapshot_flow](https://docs.databricks.com/aws/en/dlt-ref/dlt-python-ref-apply-changes-from-snapshot) API support | Bronze layer|
 | [append_flow](https://docs.databricks.com/en/delta-live-tables/flows.html#use-append-flow-to-write-to-a-streaming-table-from-multiple-source-streams) API support | Bronze layer|
 | Liquid cluster support | Bronze, Bronze Quarantine, Silver tables|
 | [DLT-META CLI](https://databrickslabs.github.io/dlt-meta/getting_started/dltmeta_cli/) | ```databricks labs dlt-meta onboard```, ```databricks labs dlt-meta deploy``` |
 | Bronze and Silver pipeline chaining | Deploy dlt-meta pipeline with ```layer=bronze_silver``` option using Direct publishing mode |
-| [DLT Sinks](https://docs.databricks.com/aws/en/delta-live-tables/dlt-sinks) | Supported formats: external ```delta table```, ```kafka```. Bronze, Silver layers |
+| [create_sink](https://docs.databricks.com/aws/en/dlt-ref/dlt-python-ref-sink) API support | Supported formats: external ```delta table```, ```kafka```. Bronze, Silver layers |
 | [Databricks Asset Bundles](https://docs.databricks.com/aws/en/dev-tools/bundles/) | Supported |
+| [DLT-META UI](https://github.com/databrickslabs/dlt-meta/tree/main/lakehouse_app#dlt-meta-lakehouse-app-setup) | Uses Databricks Lakehouse DLT-META App |

 ## Getting Started
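For the `create_sink` row above, a rough sketch of how the sink API is used in pipeline code; the sink name, topic, bootstrap servers, and source table are placeholder assumptions:

```python
import dlt

# Define an external Kafka sink (name and options are illustrative)
dlt.create_sink(
    name="kafka_sink",
    format="kafka",
    options={
        "kafka.bootstrap.servers": "host:9092",
        "topic": "silver_events",
    },
)

# Route a stream into the sink with an append flow
@dlt.append_flow(name="to_kafka", target="kafka_sink")
def to_kafka():
    return spark.readStream.table("silver_events_st")
```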

@@ -137,38 +138,37 @@ If you want to run existing demo files please follow these steps before running
    dlt_meta_home=$(pwd)
    export PYTHONPATH=$dlt_meta_home
    ```
+   ![onboardingDLTMeta.gif](docs/static/images/onboardingDLTMeta.gif)
+
 7. Run onboarding command:
    ```commandline
    databricks labs dlt-meta onboard
    ```
-   ![onboardingDLTMeta.gif](docs/static/images/onboardingDLTMeta.gif)
-
-   Above commands will prompt you to provide onboarding details. If you have cloned dlt-meta git repo then accept defaults which will launch config from demo folder.
+   The command will prompt you to provide onboarding details. If you have cloned the dlt-meta repository, accept the defaults, which use the configuration from the demo folder.
    ![onboardingDLTMeta_2.gif](docs/static/images/onboardingDLTMeta_2.gif)
-
-- Go to your Databricks workspace and locate the onboarding job under: Workflow -> Job runs
+   The onboard CLI command will:
+   1. Push code and data to your Databricks workspace
+   2. Create an onboarding job
+   3. Display a success message: ```Job created successfully. job_id={job_id}, url=https://{databricks workspace url}/jobs/{job_id}```
+   4. The job URL will automatically open in your default browser.

 ### deploy using dlt-meta CLI:

-- Once the onboarding job is finished, deploy the `bronze` and `silver` DLT pipelines using the command below
+- Once the onboarding job is finished, deploy the Lakeflow Declarative Pipeline using the command below
  ```commandline
  databricks labs dlt-meta deploy
  ```
-- Above command will prompt you to provide dlt details. Please provide the respective details for the schema you provided in the steps above
-- Bronze DLT
-
-![deployingDLTMeta_bronze.gif](docs/static/images/deployingDLTMeta_bronze.gif)
+  The command will prompt you to provide pipeline configuration details.

+![deployingDLTMeta_bronze_silver.gif](docs/static/images/deployingDLTMeta_bronze_silver.gif)

-- Silver DLT
-- ```commandline
-  databricks labs dlt-meta deploy
-  ```
-- Above command will prompt you to provide dlt details. Please provide the respective details for the schema you provided in the steps above
-
-![deployingDLTMeta_silver.gif](docs/static/images/deployingDLTMeta_silver.gif)
+The deploy CLI command will:
+1. Deploy the Lakeflow Declarative Pipeline with dlt-meta configuration such as ```layer```, ```group```, ```dataflowSpec table details``` etc. to your Databricks workspace
+2. Display a message: ```dlt-meta pipeline={pipeline_id} created and launched with update_id={pipeline_update_id}, url=https://{databricks workspace url}/#joblist/pipelines/{pipeline_id}```
+3. The pipeline URL will automatically open in your default browser.

 ## More questions

docs/content/app/_index.md

Lines changed: 0 additions & 38 deletions
This file was deleted.

docs/content/demo/Append_FLOW_CF.md

Lines changed: 16 additions & 4 deletions
@@ -21,15 +21,26 @@ This demo will perform following tasks:
    databricks auth login --host WORKSPACE_HOST
    ```
-3. ```commandline
+3. Install Python package requirements:
+   ```commandline
+   # Core requirements
+   pip install "PyYAML>=6.0" setuptools databricks-sdk
+
+   # Development requirements
+   pip install flake8==6.0 delta-spark==3.0.0 "pytest>=7.0.0" "coverage>=7.0.0" pyspark==3.5.5
+   ```
+
+4. Clone dlt-meta:
+   ```commandline
    git clone https://github.com/databrickslabs/dlt-meta.git
    ```
-4. ```commandline
+5. Navigate to project directory:
+   ```commandline
    cd dlt-meta
    ```
-5. Set python environment variable into terminal
+6. Set python environment variable into terminal
    ```commandline
    dlt_meta_home=$(pwd)
    ```
@@ -38,7 +49,8 @@ This demo will perform following tasks:
    export PYTHONPATH=$dlt_meta_home
    ```
-6. ```commandline
+7. Run the command:
+   ```commandline
    python demo/launch_af_cloudfiles_demo.py --cloud_provider_name=aws --dbr_version=15.3.x-scala2.12 --dbfs_path=dbfs:/tmp/DLT-META/demo/ --uc_catalog_name=dlt_meta_uc
    ```
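For context, this demo exercises the `append_flow` API. A minimal sketch of the pattern — one streaming table fed by multiple Auto Loader sources — where the table name and volume paths are illustrative assumptions:

```python
import dlt

# Target streaming table that several source streams append into
dlt.create_streaming_table("bronze_iot_events")

# Each append flow reads one Auto Loader source (paths are illustrative)
@dlt.append_flow(target="bronze_iot_events")
def main_feed():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/demo/raw/iot_main")
    )

@dlt.append_flow(target="bronze_iot_events")
def backfill_feed():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/demo/raw/iot_backfill")
    )
```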

docs/content/demo/Append_FLOW_EH.md

Lines changed: 17 additions & 5 deletions
@@ -18,21 +18,32 @@ draft: false
    databricks auth login --host WORKSPACE_HOST
    ```
-3. ```commandline
+3. Install Python package requirements:
+   ```commandline
+   # Core requirements
+   pip install "PyYAML>=6.0" setuptools databricks-sdk
+
+   # Development requirements
+   pip install flake8==6.0 delta-spark==3.0.0 "pytest>=7.0.0" "coverage>=7.0.0" pyspark==3.5.5
+   ```
+
+4. Clone dlt-meta:
+   ```commandline
    git clone https://github.com/databrickslabs/dlt-meta.git
    ```
-4. ```commandline
+5. Navigate to project directory:
+   ```commandline
    cd dlt-meta
    ```
-5. Set python environment variable into terminal
+6. Set python environment variable into terminal
    ```commandline
    dlt_meta_home=$(pwd)
    ```
    ```commandline
    export PYTHONPATH=$dlt_meta_home
    ```
-6. Eventhub
+7. Configure Eventhub
    - Needs an Eventhub instance running
    - Needs two Eventhub topics: one for the main feed (eventhub_name) and one for the append flow feed (eventhub_name_append_flow)
    - Create a Databricks secrets scope for the Eventhub keys
@@ -61,7 +72,8 @@ draft: false
    - eventhub_secrets_scope_name: Databricks secret scope name e.g. eventhubs_dltmeta_creds
    - eventhub_port: Eventhub port
-7. ```commandline
+8. Run the command:
+   ```commandline
    python demo/launch_af_eventhub_demo.py --cloud_provider_name=aws --uc_catalog_name=dlt_meta_uc --eventhub_name=dltmeta_demo --eventhub_name_append_flow=dltmeta_demo_af --eventhub_secrets_scope_name=dltmeta_eventhub_creds --eventhub_namespace=dltmeta --eventhub_port=9093 --eventhub_producer_accesskey_name=RootManageSharedAccessKey --eventhub_consumer_accesskey_name=RootManageSharedAccessKey --eventhub_accesskey_secret_name=RootManageSharedAccessKey
    ```
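The demo reads Eventhub credentials from the secrets scope passed via `--eventhub_secrets_scope_name`. One way to create it with the Databricks CLI — the scope and key names here mirror the example values above and are assumptions, not fixed names:

```commandline
# Create the scope, then store the Eventhub access key in it
databricks secrets create-scope dltmeta_eventhub_creds
databricks secrets put-secret dltmeta_eventhub_creds RootManageSharedAccessKey
```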

docs/content/demo/Apply_Changes_From_Snapshot.md

Lines changed: 16 additions & 4 deletions
@@ -26,21 +26,33 @@ draft: false
    databricks auth login --host WORKSPACE_HOST
    ```
-3. ```commandline
+3. Install Python package requirements:
+   ```commandline
+   # Core requirements
+   pip install "PyYAML>=6.0" setuptools databricks-sdk
+
+   # Development requirements
+   pip install flake8==6.0 delta-spark==3.0.0 "pytest>=7.0.0" "coverage>=7.0.0" pyspark==3.5.5
+   ```
+
+4. Clone dlt-meta:
+   ```commandline
    git clone https://github.com/databrickslabs/dlt-meta.git
    ```
-4. ```commandline
+5. Navigate to project directory:
+   ```commandline
    cd dlt-meta
    ```
-5. Set python environment variable into terminal
+6. Set python environment variable into terminal
    ```commandline
    dlt_meta_home=$(pwd)
    ```
    ```commandline
    export PYTHONPATH=$dlt_meta_home
    ```
-6. ```commandline
+7. Run the command:
+   ```commandline
    python demo/launch_acfs_demo.py --uc_catalog_name=<<uc catalog name>>
    ```
    - uc_catalog_name : Unity catalog name
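For context, a minimal sketch of the snapshot-CDC pattern this demo drives from metadata, using the `apply_changes_from_snapshot` Python API (renamed `create_auto_cdc_from_snapshot_flow` in newer releases); the table and key names are illustrative assumptions:

```python
import dlt

# Keep a target table in sync with periodic full snapshots of a source
dlt.create_streaming_table("silver_stores")

dlt.apply_changes_from_snapshot(
    target="silver_stores",
    source="bronze_stores_snapshot",  # snapshot table re-read on each update
    keys=["store_id"],
    stored_as_scd_type=1,  # SCD1 overwrites in place; use 2 to keep history
)
```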

docs/content/demo/DAB.md

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
+---
+title: "DAB Demo"
+date: 2024-02-26T14:25:26-04:00
+weight: 28
+draft: false
+---
+
+### DAB Demo
+
+## Overview
+This demo showcases how to use Databricks Asset Bundles (DABs) with DLT-META.
+
+This demo will perform the following steps:
+- Create dlt-meta schemas for the dataflowspec and bronze/silver layers
+- Upload necessary resources to a Unity Catalog volume
+- Create DAB files with catalog, schema, and file locations populated
+- Deploy the DAB to a Databricks workspace
+- Run onboarding using DAB commands
+- Run Bronze/Silver pipelines using DAB commands
+- Demo examples will showcase the fan-out pattern in the silver layer
+- Demo examples will showcase custom transformations for the bronze/silver layers:
+  - Adding custom columns and metadata to Bronze tables
+  - Implementing SCD Type 1 to Silver tables
+  - Applying expectations to filter data in Silver tables
+
+### Steps:
+1. Launch Command Prompt
+
+2. Install [Databricks CLI](https://docs.databricks.com/dev-tools/cli/index.html)
+   - Once you install Databricks CLI, authenticate your current machine to a Databricks Workspace:
+
+   ```commandline
+   databricks auth login --host WORKSPACE_HOST
+   ```
+
+3. Install Python package requirements:
+   ```commandline
+   # Core requirements
+   pip install "PyYAML>=6.0" setuptools databricks-sdk
+
+   # Development requirements
+   pip install flake8==6.0 delta-spark==3.0.0 "pytest>=7.0.0" "coverage>=7.0.0" pyspark==3.5.5
+   ```
+
+4. Clone dlt-meta:
+   ```commandline
+   git clone https://github.com/databrickslabs/dlt-meta.git
+   ```
+
+5. Navigate to project directory:
+   ```commandline
+   cd dlt-meta
+   ```
+
+6. Set python environment variable into terminal:
+   ```commandline
+   dlt_meta_home=$(pwd)
+   export PYTHONPATH=$dlt_meta_home
+   ```
+
+7. Generate DAB resources and set up schemas:
+   This command will:
+   - Generate DAB configuration files
+   - Create DLT-META schemas
+   - Upload necessary files to volumes
+   ```commandline
+   python demo/generate_dabs_resources.py --source=cloudfiles --uc_catalog_name=<your_catalog_name> --profile=<your_profile>
+   ```
+   > Note: If you don't specify `--profile`, you'll be prompted for your Databricks workspace URL and access token.
+
+8. Deploy and run the DAB bundle:
+   - Navigate to the DAB directory:
+   ```commandline
+   cd demo/dabs
+   ```
+
+   - Validate the bundle configuration:
+   ```commandline
+   databricks bundle validate --profile=<your_profile>
+   ```
+
+   - Deploy the bundle to the dev environment:
+   ```commandline
+   databricks bundle deploy --target dev --profile=<your_profile>
+   ```
+
+   - Run the onboarding job:
+   ```commandline
+   databricks bundle run onboard_people -t dev --profile=<your_profile>
+   ```
+
+   - Execute the pipelines:
+   ```commandline
+   databricks bundle run execute_pipelines_people -t dev --profile=<your_profile>
+   ```
+
+![dab_onboarding_job.png](/images/dab_onboarding_job.png)
+![dab_dlt_pipelines.png](/images/dab_dlt_pipelines.png)
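For readers new to DABs, a skeleton of the kind of bundle config the generator produces might look like the following; the bundle name, target, and host are illustrative assumptions, and `demo/generate_dabs_resources.py` creates the real files:

```yaml
# databricks.yml — illustrative skeleton only
bundle:
  name: dlt_meta_dabs_demo

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://<your-workspace>.cloud.databricks.com
```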

docs/content/demo/DAIS.md

Lines changed: 16 additions & 4 deletions
@@ -23,23 +23,35 @@ This demo showcases DLT-META's capabilities of creating Bronze and Silver DLT pi
    databricks auth login --host WORKSPACE_HOST
    ```
-3. ```commandline
+3. Install Python package requirements:
+   ```commandline
+   # Core requirements
+   pip install "PyYAML>=6.0" setuptools databricks-sdk
+
+   # Development requirements
+   pip install flake8==6.0 delta-spark==3.0.0 "pytest>=7.0.0" "coverage>=7.0.0" pyspark==3.5.5
+   ```
+
+4. Clone dlt-meta:
+   ```commandline
    git clone https://github.com/databrickslabs/dlt-meta.git
    ```
-4. ```commandline
+5. Navigate to project directory:
+   ```commandline
    cd dlt-meta
    ```
-5. Set python environment variable into terminal
+6. Set python environment variable into terminal
    ```commandline
    dlt_meta_home=$(pwd)
    ```
    ```commandline
    export PYTHONPATH=$dlt_meta_home
    ```
-6. ```commandline
+7. Run the command:
+   ```commandline
    python demo/launch_dais_demo.py --uc_catalog_name=<<uc catalog name>> --cloud_provider_name=<<>>
    ```
    - uc_catalog_name : Unity catalog name
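A filled-in invocation, reusing the illustrative values from the other demos (`dlt_meta_uc` and `aws` are assumptions; substitute your own):

```commandline
python demo/launch_dais_demo.py --uc_catalog_name=dlt_meta_uc --cloud_provider_name=aws
```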
