
Commit 387abbb

Merge pull request #207 from databrickslabs/Issue_22
Issue 22
2 parents: ebab278 + 0cd4126

File tree

9 files changed: +215 −97 lines changed

.gitignore

Lines changed: 0 additions & 1 deletion

````diff
@@ -156,5 +156,4 @@ demo/conf/onboarding.json
 integration_tests/conf/onboarding*.json
 demo/conf/onboarding*.json
 integration_test_output*.csv
-databricks.yml
 onboarding_job_details.json
````

README.md

Lines changed: 30 additions & 18 deletions

````diff
@@ -60,6 +60,7 @@ In practice, a single generic pipeline reads the Dataflowspec and uses it to orc
 | [DLT-META CLI](https://databrickslabs.github.io/dlt-meta/getting_started/dltmeta_cli/) | ```databricks labs dlt-meta onboard```, ```databricks labs dlt-meta deploy``` |
 | Bronze and Silver pipeline chaining | Deploy dlt-meta pipeline with ```layer=bronze_silver``` option using Direct publishing mode |
 | [DLT Sinks](https://docs.databricks.com/aws/en/delta-live-tables/dlt-sinks) | Supported formats: external ```delta table```, ```kafka```. Bronze, Silver layers |
+| [Databricks Asset Bundles](https://docs.databricks.com/aws/en/dev-tools/bundles/) | Supported |
 
 ## Getting Started
 
@@ -99,36 +100,47 @@ databricks auth login --host WORKSPACE_HOST
 
 If you want to run existing demo files please follow these steps before running the onboard command:
 
-```commandline
+1. Clone dlt-meta:
+```commandline
 git clone https://github.com/databrickslabs/dlt-meta.git
-```
+```
 
-```commandline
+2. Navigate to the project directory:
+```commandline
 cd dlt-meta
-```
+```
 
-```commandline
+3. Create a Python virtual environment:
+```commandline
 python -m venv .venv
-```
+```
 
-```commandline
+4. Activate the virtual environment:
+```commandline
 source .venv/bin/activate
-```
+```
 
-```commandline
-pip install databricks-sdk
-```
+5. Install required packages:
+```commandline
+# Core requirements
+pip install "PyYAML>=6.0" setuptools databricks-sdk
+
+# Development requirements
+pip install delta-spark==3.0.0 pyspark==3.5.5 "pytest>=7.0.0" "coverage>=7.0.0"
+
+# Integration test requirements
+pip install "typer[all]==0.6.1"
+```
 
-```commandline
+6. Set environment variables:
+```commandline
 dlt_meta_home=$(pwd)
-```
-
-```commandline
 export PYTHONPATH=$dlt_meta_home
-```
-```commandline
+```
+7. Run the onboarding command:
+```commandline
 databricks labs dlt-meta onboard
-```
+```
 ![onboardingDLTMeta.gif](docs/static/images/onboardingDLTMeta.gif)
````
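The setup steps above can be sanity-checked from Python. This is a hedged sketch, not part of dlt-meta itself: the helper below only verifies that the listed requirements are importable and mirrors step 6's `export PYTHONPATH=$dlt_meta_home` for an already-running interpreter (which does not re-read `PYTHONPATH` once started).

```python
# Hypothetical helper: check that the packages from step 5 are installed and
# replicate the PYTHONPATH setup from step 6 inside the current interpreter.
import importlib.util
import os
import sys

def can_import(name):
    """True if `name` resolves to an installed module, without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        return False

# Import names of the pip packages listed above (PyYAML imports as `yaml`).
required = {"yaml": "PyYAML", "databricks.sdk": "databricks-sdk"}
missing = [pkg for mod, pkg in required.items() if not can_import(mod)]

# Equivalent of: dlt_meta_home=$(pwd); export PYTHONPATH=$dlt_meta_home
dlt_meta_home = os.getcwd()
os.environ["PYTHONPATH"] = dlt_meta_home
if dlt_meta_home not in sys.path:
    sys.path.insert(0, dlt_meta_home)
```

If `missing` is non-empty, re-run the corresponding `pip install` line from step 5.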

demo/README.md

Lines changed: 92 additions & 43 deletions
````diff
@@ -21,15 +21,22 @@ This Demo launches Bronze and Silver pipelines with following activities:
 
 2. Install [Databricks CLI](https://docs.databricks.com/dev-tools/cli/index.html)
 
-3. ```commandline
+3. Install Python package requirements:
+```commandline
+pip install "PyYAML>=6.0" setuptools databricks-sdk
+pip install delta-spark==3.0.0 pyspark==3.5.5
+```
+
+4. Clone dlt-meta:
+```commandline
 git clone https://github.com/databrickslabs/dlt-meta.git
 ```
-4. ```commandline
+5. ```commandline
 cd dlt-meta
 ```
-5. Set python environment variable into terminal
+6. Set python environment variable into terminal
 ```commandline
 dlt_meta_home=$(pwd)
 ```
@@ -38,7 +45,7 @@ This Demo launches Bronze and Silver pipelines with following activities:
 export PYTHONPATH=$dlt_meta_home
 ```
-6. ```commandline
+7. ```commandline
 python demo/launch_dais_demo.py --uc_catalog_name=<<uc catalog name>> --profile=<<DEFAULT>>
 ```
 - uc_catalog_name : Unity catalog name
````
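The launcher's command-line interface can be sketched as follows. This is a hypothetical mirror of the flags shown above, not the real `demo/launch_dais_demo.py` (which lives in the repo); it only illustrates how the two arguments relate.

```python
# Standalone argparse sketch of the demo launcher flags (illustrative only;
# the actual parsing logic in launch_dais_demo.py may differ).
import argparse

def build_parser():
    p = argparse.ArgumentParser(prog="launch_dais_demo.py")
    p.add_argument("--uc_catalog_name", required=True,
                   help="Unity Catalog name the demo tables are created in")
    p.add_argument("--profile", default="DEFAULT",
                   help="Databricks CLI auth profile; prompted for host/token if absent")
    return p

# Example invocation with a placeholder catalog name:
args = build_parser().parse_args(["--uc_catalog_name", "main"])
```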
````diff
@@ -53,15 +60,21 @@ This demo will launch auto generated tables(100s) inside single bronze and silve
 
 2. Install [Databricks CLI](https://docs.databricks.com/dev-tools/cli/index.html)
 
-3. ```commandline
+3. Install Python package requirements:
+```commandline
+pip install "PyYAML>=6.0" setuptools databricks-sdk
+pip install delta-spark==3.0.0 pyspark==3.5.5
+```
+
+4. ```commandline
 git clone https://github.com/databrickslabs/dlt-meta.git
 ```
-4. ```commandline
+5. ```commandline
 cd dlt-meta
 ```
-5. Set python environment variable into terminal
+6. Set python environment variable into terminal
 ```commandline
 dlt_meta_home=$(pwd)
 ```
@@ -70,7 +83,7 @@ This demo will launch auto generated tables(100s) inside single bronze and silve
 export PYTHONPATH=$dlt_meta_home
 ```
-6. ```commandline
+7. ```commandline
 python demo/launch_techsummit_demo.py --uc_catalog_name=<<uc catalog name>> --profile=<<DEFAULT>>
 ```
 - uc_catalog_name : Unity catalog name
````
````diff
@@ -89,15 +102,21 @@ This demo will perform following tasks:
 
 2. Install [Databricks CLI](https://docs.databricks.com/dev-tools/cli/index.html)
 
-3. ```commandline
+3. Install Python package requirements:
+```commandline
+pip install "PyYAML>=6.0" setuptools databricks-sdk
+pip install delta-spark==3.0.0 pyspark==3.5.5
+```
+
+4. ```commandline
 git clone https://github.com/databrickslabs/dlt-meta.git
 ```
-4. ```commandline
+5. ```commandline
 cd dlt-meta
 ```
-5. Set python environment variable into terminal
+6. Set python environment variable into terminal
 ```commandline
 dlt_meta_home=$(pwd)
 ```
@@ -106,7 +125,7 @@ This demo will perform following tasks:
 export PYTHONPATH=$dlt_meta_home
 ```
-6. ```commandline
+7. ```commandline
 python demo/launch_af_cloudfiles_demo.py --uc_catalog_name=<<uc catalog name>> --source=cloudfiles --profile=<<DEFAULT>>
 ```
 - uc_catalog_name : Unity Catalog name
````
````diff
@@ -122,14 +141,20 @@ This demo will perform following tasks:
 
 2. Install [Databricks CLI](https://docs.databricks.com/dev-tools/cli/index.html)
 
-3. ```commandline
+3. Install Python package requirements:
+```commandline
+pip install "PyYAML>=6.0" setuptools databricks-sdk
+pip install delta-spark==3.0.0 pyspark==3.5.5
+```
+
+4. ```commandline
 git clone https://github.com/databrickslabs/dlt-meta.git
 ```
-4. ```commandline
+5. ```commandline
 cd dlt-meta
 ```
-5. Set python environment variable into terminal
+6. Set python environment variable into terminal
 ```commandline
 dlt_meta_home=$(pwd)
 ```
````
````diff
@@ -181,14 +206,20 @@ This demo will perform following tasks:
 
 2. Install [Databricks CLI](https://docs.databricks.com/dev-tools/cli/index.html)
 
-3. ```commandline
+3. Install Python package requirements:
+```commandline
+pip install "PyYAML>=6.0" setuptools databricks-sdk
+pip install delta-spark==3.0.0 pyspark==3.5.5
+```
+
+4. ```commandline
 git clone https://github.com/databrickslabs/dlt-meta.git
 ```
-4. ```commandline
+5. ```commandline
 cd dlt-meta
 ```
-5. Set python environment variable into terminal
+6. Set python environment variable into terminal
 ```commandline
 dlt_meta_home=$(pwd)
 ```
````
````diff
@@ -198,15 +229,15 @@ This demo will perform following tasks:
 
 6. Run the command
 ```commandline
-python demo/launch_silver_fanout_demo.py --source=cloudfiles --uc_catalog_name=<<uc catalog name>> --profile=<<DEFAULT>>
+python demo/launch_silver_fanout_demo.py --source=cloudfiles --uc_catalog_name=<<uc catalog name>> --profile=<<DEFAULT>>
 ```
 
 - you can provide `--profile=databricks_profile name` in case you already have databricks cli otherwise command prompt will ask host and token.
 
-- - 6a. Databricks Workspace URL:
-- - Enter your workspace URL, with the format https://<instance-name>.cloud.databricks.com. To get your workspace URL, see Workspace instance names, URLs, and IDs.
+a. Databricks Workspace URL:
+Enter your workspace URL, with the format https://<instance-name>.cloud.databricks.com. To get your workspace URL, see Workspace instance names, URLs, and IDs.
 
-- - 6b. Token:
+b. Token:
 - In your Databricks workspace, click your Databricks username in the top bar, and then select User Settings from the drop down.
 
 - On the Access tokens tab, click Generate new token.
````
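Instead of entering the host and token at each prompt, the Databricks CLI can read them from a profile in `~/.databrickscfg`. A minimal profile looks like the sketch below; both values are placeholders you must replace with your own workspace URL and generated token.

```ini
[DEFAULT]
host  = https://<instance-name>.cloud.databricks.com
token = <personal-access-token>
```

Pass `--profile=DEFAULT` (or your profile name) to the launch commands above to use it.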
````diff
@@ -241,14 +272,20 @@ This demo will perform following tasks:
 
 2. Install [Databricks CLI](https://docs.databricks.com/dev-tools/cli/index.html)
 
-3. ```commandline
+3. Install Python package requirements:
+```commandline
+pip install "PyYAML>=6.0" setuptools databricks-sdk
+pip install delta-spark==3.0.0 pyspark==3.5.5
+```
+
+4. ```commandline
 git clone https://github.com/databrickslabs/dlt-meta.git
 ```
-4. ```commandline
+5. ```commandline
 cd dlt-meta
 ```
-5. Set python environment variable into terminal
+6. Set python environment variable into terminal
 ```commandline
 dlt_meta_home=$(pwd)
 ```
````
````diff
@@ -276,14 +313,20 @@ This demo will perform following tasks:
 
 2. Install [Databricks CLI](https://docs.databricks.com/dev-tools/cli/index.html)
 
-3. ```commandline
+3. Install Python package requirements:
+```commandline
+pip install "PyYAML>=6.0" setuptools databricks-sdk
+pip install delta-spark==3.0.0 pyspark==3.5.5
+```
+
+4. ```commandline
 git clone https://github.com/databrickslabs/dlt-meta.git
 ```
-4. ```commandline
+5. ```commandline
 cd dlt-meta
 ```
-5. Set python environment variable into terminal
+6. Set python environment variable into terminal
 ```commandline
 dlt_meta_home=$(pwd)
 ```
````
````diff
@@ -316,32 +359,38 @@ This demo will perform following tasks:
 
 ## Overview
 This demo showcases how to use Databricks Asset Bundles (DABs) with DLT-Meta:
-* This demo will perform following steps
-* * Create dlt-meta schema's for dataflowspec and bronze/silver layer
-* * Upload nccessary resources to unity catalog volume
-* * Create DAB files with catalog, schema, file locations populated
-* * Deploy DAB to databricks workspace
-* * Run onboarding usind DAB commands
-* * Run Bronze/Silver Pipelines using DAB commands
-* * Demo examples will showcase fan-out pattern in silver layer
-* * Demo example will show case custom transfomations for bronze/silver layers
-* * Adding custom columns and metadata to Bronze tables
-* * Implementing SCD Type 1 to Silver tables
-* * Applying expectations to filter data in Silver tables
+This demo will perform the following steps:
+- Create dlt-meta schemas for dataflowspec and bronze/silver layer
+- Upload necessary resources to unity catalog volume
+- Create DAB files with catalog, schema, file locations populated
+- Deploy DAB to databricks workspace
+- Run onboarding using DAB commands
+- Run Bronze/Silver Pipelines using DAB commands
+- Demo examples will showcase fan-out pattern in silver layer
+- Demo example will showcase custom transformations for bronze/silver layers
+- Adding custom columns and metadata to Bronze tables
+- Implementing SCD Type 1 to Silver tables
+- Applying expectations to filter data in Silver tables
 
 ### Steps:
 1. Launch Command Prompt
 
 2. Install [Databricks CLI](https://docs.databricks.com/dev-tools/cli/index.html)
 
-3. ```commandline
+3. Install Python package requirements:
+```commandline
+pip install "PyYAML>=6.0" setuptools databricks-sdk
+pip install delta-spark==3.0.0 pyspark==3.5.5
+```
+
+4. ```commandline
 git clone https://github.com/databrickslabs/dlt-meta.git
 ```
-4. ```commandline
+5. ```commandline
 cd dlt-meta
 ```
-5. Set python environment variable into terminal
+6. Set python environment variable into terminal
 ```commandline
 dlt_meta_home=$(pwd)
 ```
````
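After these setup steps, a typical bundle workflow uses the standard Databricks CLI bundle commands. This is a hedged sketch: it assumes a recent Databricks CLI with bundle support, prior authentication via `databricks auth login`, and the `demo/dabs` path from this repo; it is guarded so it is a no-op when the CLI is absent.

```shell
# Validate and deploy the demo bundle (no-op if the databricks CLI is missing).
if command -v databricks >/dev/null 2>&1; then
  (cd demo/dabs &&
   databricks bundle validate &&        # check databricks.yml and resources/*.yml
   databricks bundle deploy -t dev      # deploy to the default 'dev' target
  ) || echo "bundle workflow failed (expected without workspace auth)"
else
  echo "databricks CLI not found; skipping"
fi
```

A deployed job can then be started with `databricks bundle run <job_name>`.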

demo/dabs/databricks.yml

Lines changed: 42 additions & 0 deletions (new file)

```yaml
# This is a Databricks asset bundle definition for dbx_dab_dlt_meta_demo_aug_2025.
# See https://docs.databricks.com/dev-tools/bundles/index.html for documentation.
bundle:
  name: dab_dlt_meta_demo
  uuid: fb711a7b-ceb7-4054-a49c-cf1d53702692

include:
  - resources/*.yml
sync:
  exclude:
    - .gitignore
    - .DS_Store
    - .vscode
    - bundle_config_schema.json
    - .venv/
    - resources/jobs.template

targets:
  dev:
    # The default target uses 'mode: development' to create a development copy.
    # - Deployed resources get prefixed with '[dev my_user_name]'
    # - Any job schedules and triggers are paused by default.
    # See also https://docs.databricks.com/dev-tools/bundles/deployment-modes.html.
    mode: development
    default: true
    # SET DATABRICKS_HOST=<your-databricks-workspace-url>
    # workspace:
    #   root_path: /Workspace/Shared/${bundle.name}/demo/${bundle.target}

  prod:
    mode: production
    workspace:
      # SET DATABRICKS_HOST=<your-databricks-workspace-url>
      # host: https://
      # We explicitly deploy to /Workspace/Shared/${bundle.name} to make sure we only have a single copy.
      root_path: /Workspace/Shared/${bundle.name}/demo/${bundle.target}
    # Remember to set the correct group name for production deployments.
    permissions:
      - group_name: users
        level: CAN_MANAGE
      # - user_name: service_principal_id
      #   level: CAN_MANAGE
```
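The target-selection and variable-interpolation behavior of this bundle file can be sketched in plain Python. The dict below is a hand transcription of the YAML above, and the two helpers are illustrative mirrors of what the Databricks CLI does, not part of its API.

```python
# Hand-transcribed (abridged) view of the databricks.yml above.
bundle_cfg = {
    "bundle": {"name": "dab_dlt_meta_demo"},
    "include": ["resources/*.yml"],
    "targets": {
        "dev": {"mode": "development", "default": True},
        "prod": {
            "mode": "production",
            "workspace": {
                "root_path": "/Workspace/Shared/${bundle.name}/demo/${bundle.target}"
            },
        },
    },
}

def default_target(cfg):
    """Mimic how `databricks bundle deploy` picks a target when -t is omitted."""
    return next(name for name, t in cfg["targets"].items() if t.get("default"))

def resolve(path, cfg, target):
    """Expand the two ${...} variable references used in this file."""
    return (path.replace("${bundle.name}", cfg["bundle"]["name"])
                .replace("${bundle.target}", target))

prod_root = resolve(
    bundle_cfg["targets"]["prod"]["workspace"]["root_path"], bundle_cfg, "prod"
)
```

With the values above, `prod` deployments land under `/Workspace/Shared/dab_dlt_meta_demo/demo/prod`, while `dev` is the target chosen by default.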
