
Commit b6d584c: Merge branch 'current' into nfiann-gitlabedit
2 parents cc5db0e + 1f2834d

File tree

17 files changed: +234 -83 lines changed


website/docs/docs/build/data-tests.md

Lines changed: 20 additions & 20 deletions
@@ -32,14 +32,14 @@ Data tests are assertions you make about your models and other resources in your
 
 You can use data tests to improve the integrity of the SQL in each model by making assertions about the results generated. Out of the box, you can test whether a specified column in a model only contains non-null values, unique values, or values that have a corresponding value in another model (for example, a `customer_id` for an `order` corresponds to an `id` in the `customers` model), and values from a specified list. You can extend data tests to suit business logic specific to your organization – any assertion that you can make about your model in the form of a select query can be turned into a data test.
 
-Data tests return a set of failing records. Generic data tests (a.k.a. schema tests) are defined using `test` blocks.
+Data tests return a set of failing records. Generic data tests (also known as schema tests) are defined using `test` blocks.
 
-Like almost everything in dbt, data tests are SQL queries. In particular, they are `select` statements that seek to grab "failing" records, ones that disprove your assertion. If you assert that a column is unique in a model, the test query selects for duplicates; if you assert that a column is never null, the test seeks after nulls. If the data test returns zero failing rows, it passes, and your assertion has been validated.
+Like almost everything in dbt, data tests are SQL queries. In particular, they are `select` statements that seek to grab "failing" records, ones that disprove your assertion. If you assert that a column is unique in a model, the test query selects for duplicates; if you assert that a column is never null, the test seeks nulls. If the data test returns zero failing rows, it passes, and your assertion has been validated.
 
 There are two ways of defining data tests in dbt:
 
-- A **singular** data test is testing in its simplest form: If you can write a SQL query that returns failing rows, you can save that query in a `.sql` file within your [test directory](/reference/project-configs/test-paths). It's now a data test, and it will be executed by the `dbt test` command.
-- A **generic** data test is a parameterized query that accepts arguments. The test query is defined in a special `test` block (like a [macro](jinja-macros)). Once defined, you can reference the generic test by name throughout your `.yml` files—define it on models, columns, sources, snapshots, and seeds. dbt ships with four generic data tests built in, and we think you should use them!
+- A **singular** data test is testing in its simplest form: write a SQL query that returns failing rows and save it in a `.sql` file within your [test directory](/reference/project-configs/test-paths). It's now a data test, and it will be executed by the `dbt test` command.
+- A **generic** data test is a parameterized query that accepts arguments. The test query is defined in a special `test` block (like a [macro](/docs/build/jinja-macros)). Once defined, you can reference the generic test by name throughout your `.yml` files—define it on models, columns, sources, snapshots, and seeds. dbt ships with four generic data tests built in, and we think you should use them!
 
 Defining data tests is a great way to confirm that your outputs and inputs are as expected, and helps prevent regressions when your code changes. Because you can use them over and over again, making similar assertions with minor variations, generic data tests tend to be much more common—they should make up the bulk of your dbt data testing suite. That said, both ways of defining data tests have their time and place.

@@ -51,7 +51,7 @@ If you're new to dbt, we recommend that you check out our [online dbt Fundamenta
 
 The simplest way to define a data test is by writing the exact SQL that will return failing records. We call these "singular" data tests, because they're one-off assertions usable for a single purpose.
 
-These tests are defined in `.sql` files, typically in your `tests` directory (as defined by your [`test-paths` config](/reference/project-configs/test-paths)). You can use Jinja (including `ref` and `source`) in the test definition, just like you can when creating models. Each `.sql` file contains one `select` statement, and it defines one data test:
+These tests are defined in `.sql` files, typically in your `tests` directory (as defined by your `test-paths` config). **Note:** The `tests/` directory (`test-paths`) is reserved for singular and generic data tests (SQL). Unit test YAML definitions must live under your project's `model-paths` (for example, in the `models/` directory), not in `tests/`. You can use Jinja (including `ref` and `source`) in the test definition, just like you can when creating models. Each `.sql` file contains one `select` statement, and it defines one data test:
 
 <File name='tests/assert_total_payment_amount_is_positive.sql'>
 
@@ -68,7 +68,7 @@ having total_amount < 0
 
 </File>
 
-The name of this test is the name of the file: `assert_total_payment_amount_is_positive`.
+The test name is the file name: `assert_total_payment_amount_is_positive`.
 
 Note:
 - Omit semicolons (;) at the end of the SQL statement in your singular test files, as they can cause your data test to fail.
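The body of `assert_total_payment_amount_is_positive` is elided between the hunks above. A sketch consistent with the `having total_amount < 0` fragment visible in the next hunk header; the `payments` model and its column names are assumptions based on the surrounding text:

```sql
-- Hypothetical sketch of the singular test referenced above; `payments`,
-- `order_id`, and `amount` are assumed names, not confirmed by this diff.
-- Rows returned here are failures, so zero rows means the test passes.
select
    order_id,
    sum(amount) as total_amount
from {{ ref('payments') }}
group by 1
having total_amount < 0
```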
@@ -92,7 +92,7 @@ data_tests:
 Singular data tests are so easy that you may find yourself writing the same basic structure repeatedly, only changing the name of a column or model. By that point, the test isn't so singular! In that case, we recommend generic data tests.
 
 ## Generic data tests
-Certain data tests are generic: they can be reused over and over again. A generic data test is defined in a `test` block, which contains a parametrized query and accepts arguments. It might look like:
+Certain data tests are generic: they can be reused over and over again. A generic data test is defined in a `test` block, which contains a parameterized query and accepts arguments. It might look like:
 
 ```sql
 {% test not_null(model, column_name) %}
@@ -104,15 +104,15 @@ Certain data tests are generic: they can be reused over and over again. A generi
 {% endtest %}
 ```
 
-You'll notice that there are two arguments, `model` and `column_name`, which are then templated into the query. This is what makes the data test "generic": it can be defined on as many columns as you like, across as many models as you like, and dbt will pass the values of `model` and `column_name` accordingly. Once that generic test has been defined, it can be added as a _property_ on any existing model (or source, seed, or snapshot). These properties are added in `.yml` files in the same directory as your resource.
+You'll notice that there are two arguments, `model` and `column_name`, which are then templated into the query. This is what makes the data test "generic": it can be defined on as many columns as you like, across as many models as you like, and dbt will pass the values of `model` and `column_name` accordingly. Once that generic test has been defined, it can be added as a _property_ on any existing model (or source, seed, or snapshot). These properties are added in `.yml` files in the same directory as your resource.
 
 :::info
 If this is your first time working with adding properties to a resource, check out the docs on [declaring properties](/reference/configs-and-properties).
 :::
 
-Out of the box, dbt ships with four generic data tests already defined: `unique`, `not_null`, `accepted_values` and `relationships`. Here's a full example using those tests on an `orders` model:
+Out of the box, dbt ships with four generic data tests already defined: `unique`, `not_null`, `accepted_values`, and `relationships`. Here's a full example using those tests on an `orders` model:
 
-```yml
+```yaml
 
 models:
   - name: orders
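The `not_null` definition is split across the two hunks above. Assembled, the generic test block reads as follows — a sketch consistent with the fragments shown in this diff, including the `where {{ column_name }} is null` line that surfaces in a later hunk header:

```sql
-- Assembled from the fragments visible in this diff; the body is the
-- standard not_null pattern: select every row where the column is null.
{% test not_null(model, column_name) %}

select *
from {{ model }}
where {{ column_name }} is null

{% endtest %}
```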
@@ -137,10 +137,10 @@ models:
 In plain English, these data tests translate to:
 * `unique`: the `order_id` column in the `orders` model should be unique
 * `not_null`: the `order_id` column in the `orders` model should not contain null values
-* `accepted_values`: the `status` column in the `orders` should be one of `'placed'`, `'shipped'`, `'completed'`, or `'returned'`
+* `accepted_values`: the `status` column in the `orders` model should be one of `'placed'`, `'shipped'`, `'completed'`, or `'returned'`
 * `relationships`: each `customer_id` in the `orders` model exists as an `id` in the `customers` <Term id="table" /> (also known as referential integrity)
 
-Behind the scenes, dbt constructs a `select` query for each data test, using the parametrized query from the generic test block. These queries return the rows where your assertion is _not_ true; if the test returns zero rows, your assertion passes.
+Behind the scenes, dbt constructs a `select` query for each data test, using the parameterized query from the generic test block. These queries return the rows where your assertion is _not_ true; if the test returns zero rows, your assertion passes.
 
 You can find more information about these data tests, and additional configurations (including [`severity`](/reference/resource-configs/severity) and [`tags`](/reference/resource-configs/tags)) in the [reference section](/reference/resource-properties/data-tests). You can also add descriptions to the Jinja macro that provides the core logic of a generic data test. Refer to the [Add description to generic data test logic](/best-practices/writing-custom-generic-tests#add-description-to-generic-data-test-logic) for more information.
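The `orders` property file is truncated by the hunk above. A sketch of its likely shape, built from the four bullets it describes; the layout follows the standard dbt pattern, and the exact file in the commit may differ:

```yaml
# Sketch of the truncated schema.yml example; model and column names are
# taken from the bullets above, the structure is the documented dbt pattern.
version: 2

models:
  - name: orders
    columns:
      - name: order_id
        data_tests:
          - unique
          - not_null
      - name: status
        data_tests:
          - accepted_values:
              values: ['placed', 'shipped', 'completed', 'returned']
      - name: customer_id
        data_tests:
          - relationships:
              to: ref('customers')
              field: id
```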

@@ -274,27 +274,27 @@ where {{ column_name }} is null
 
 ## Storing data test failures
 
-Normally, a data test query will calculate failures as part of its execution. If you set the optional `--store-failures` flag, the [`store_failures`](/reference/resource-configs/store_failures), or the [`store_failures_as`](/reference/resource-configs/store_failures_as) configs, dbt will first save the results of a test query to a table in the database, and then query that table to calculate the number of failures.
+Normally, a data test query will calculate failures as part of its execution. If you set the optional `--store-failures` flag, the [`store_failures`](/reference/resource-configs/store_failures) config, or the [`store_failures_as`](/reference/resource-configs/store_failures_as) config, dbt will first save the results of a test query to a table in the database, and then query that table to calculate the number of failures.
 
 This workflow allows you to query and examine failing records much more quickly in development:
 
 <Lightbox src="/img/docs/building-a-dbt-project/test-store-failures.gif" title="Store test failures in the database for faster development-time debugging."/>
 
-Note that, if you select to store data test failures:
-* Test result tables are created in a schema suffixed or named `dbt_test__audit`, by default. It is possible to change this value by setting a `schema` config. (For more details on schema naming, see [using custom schemas](/docs/build/custom-schemas).)
+Note that, if you choose to store data test failures:
+- Test result tables are created in a schema suffixed or named `dbt_test__audit`, by default. It is possible to change this value by setting a `schema` config. (For more details on schema naming, see [using custom schemas](/docs/build/custom-schemas).)
 - A test's results will always **replace** previous failures for the same test.
 
 ## New `data_tests:` syntax
-
 Data tests were historically called "tests" in dbt as the only form of testing available. With the introduction of unit tests, the key was renamed from `tests:` to `data_tests:`.
 
-dbt still supports `tests:` in your YML configuration files for backwards-compatibility purposes, and you might see it used throughout our documentation. However, you can't have a `tests` and a `data_tests` key associated with the same resource (for example, a single model) at the same time.
+dbt still supports `tests:` in your YAML configuration files for backward-compatibility purposes, and you might see it used throughout our documentation. However, you can't have a `tests` and a `data_tests` key associated with the same resource (for example, a single model) at the same time.
 
 <File name='models/schema.yml'>
 
-```yml
+```yaml
 models:
   - name: orders
     columns:
@@ -308,7 +308,7 @@ models:
 
 <File name='dbt_project.yml'>
 
-```yml
+```yaml
 data_tests:
   +store_failures: true
 ```
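With either the project config above or the CLI flag, a typical development loop looks like this — a sketch; the audit schema name depends on your target schema and any `schema` config:

```shell
# Persist failing rows for every test in the run
# (same effect as the store_failures config shown above).
dbt test --store-failures

# Failing rows land in a schema suffixed `dbt_test__audit` by default;
# inspect them by querying the table named after the failing test.
```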

website/docs/docs/build/unit-tests.md

Lines changed: 64 additions & 14 deletions
@@ -12,19 +12,11 @@ keywords:
 
 Historically, dbt's test coverage was confined to [“data” tests](/docs/build/data-tests), assessing the quality of input data or resulting datasets' structure. However, these tests could only be executed _after_ building a model.
 
-There is an additional type of test to dbt - unit tests. In software programming, unit tests validate small portions of your functional code, and they work much the same way here. Unit tests allow you to validate your SQL modeling logic on a small set of static inputs _before_ you materialize your full model in production. Unit tests enable test-driven development, benefiting developer efficiency and code reliability.
+There is an additional type of test in dbt: unit tests. In software programming, unit tests validate small portions of your functional code, and they work much the same way here. Unit tests allow you to validate your SQL modeling logic on a small set of static inputs _before_ you materialize your full model in production. Unit tests enable test-driven development, benefiting developer efficiency and code reliability.
 
-## Before you begin
+import UnitTestsPrereqs from '/snippets/_unit-tests-prereqs.md';
 
-- We currently only support unit testing SQL models.
-- We currently only support adding unit tests to models in your _current_ project.
-- We currently _don't_ support unit testing models that use the [`materialized view`](/docs/build/materializations#materialized-view) materialization.
-- We currently _don't_ support unit testing models that use recursive SQL.
-- We currently _don't_ support unit testing models that use introspective queries.
-- If your model has multiple versions, by default the unit test will run on *all* versions of your model. Read [unit testing versioned models](/reference/resource-properties/unit-testing-versions) for more information.
-- Unit tests must be defined in a YML file in your [`models/` directory](/reference/project-configs/model-paths).
-- Table names must be aliased in order to unit test `join` logic.
-- Include all [`ref`](/reference/dbt-jinja-functions/ref) or [`source`](/reference/dbt-jinja-functions/source) model references in the unit test configuration as `input`s to avoid "node not found" errors during compilation.
+<UnitTestsPrereqs />
 
 #### Adapter-specific caveats
 - You must specify all fields in a BigQuery `STRUCT` in a unit test. You cannot use only a subset of fields in a `STRUCT`.
@@ -112,16 +104,19 @@ unit_tests:
     model: dim_customers
     given:
       - input: ref('stg_customers')
+        format: dict
         rows:
           - {email: cool@example.com, email_top_level_domain: example.com}
           - {email: cool@unknown.com, email_top_level_domain: unknown.com}
           - {email: badgmail.com, email_top_level_domain: gmail.com}
           - {email: missingdot@gmailcom, email_top_level_domain: gmail.com}
       - input: ref('top_level_email_domains')
+        format: dict
         rows:
           - {tld: example.com}
           - {tld: gmail.com}
     expect:
+      format: dict
       rows:
         - {email: cool@example.com, is_valid_email_address: true}
         - {email: cool@unknown.com, is_valid_email_address: false}
@@ -133,6 +128,61 @@ unit_tests:
 
 The previous example defines the mock data using the inline `dict` format, but you can also use `csv` or `sql` either inline or in a separate fixture file. Store your fixture files in a `fixtures` subdirectory in any of your [test paths](/reference/project-configs/test-paths). For example, `tests/fixtures/my_unit_test_fixture.sql`.
 
+The following examples show how to define mock data and expected output using `csv` and `sql`.
+
+<File name='models/schema.yml'>
+
+```yaml
+unit_tests:
+  - name: test_is_valid_email_address__csv
+    model: dim_customers
+    given:
+      - input: ref('stg_customers')
+        format: dict
+        rows:
+          - {email: cool@example.com, email_top_level_domain: example.com}
+          - {email: cool@unknown.com, email_top_level_domain: unknown.com}
+          - {email: badgmail.com, email_top_level_domain: gmail.com}
+          - {email: missingdot@gmailcom, email_top_level_domain: gmail.com}
+      - input: ref('top_level_email_domains')
+        format: csv
+        rows: |
+          tld
+          example.com
+          gmail.com
+    expect:
+      format: csv
+      fixture: valid_email_address_fixture_output
+```
+
+</File>
+
+<File name='models/schema.yml'>
+
+```yaml
+unit_tests:
+  - name: test_is_valid_email_address__sql
+    model: dim_customers
+    given:
+      - input: ref('stg_customers')
+        format: dict
+        rows:
+          - {email: cool@example.com, email_top_level_domain: example.com}
+          - {email: cool@unknown.com, email_top_level_domain: unknown.com}
+          - {email: badgmail.com, email_top_level_domain: gmail.com}
+          - {email: missingdot@gmailcom, email_top_level_domain: gmail.com}
+      - input: ref('top_level_email_domains')
+        format: sql
+        rows: |
+          select 'example.com' as tld union all
+          select 'gmail.com' as tld
+    expect:
+      format: sql
+      fixture: valid_email_address_fixture_output
+```
+
+</File>
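Both examples above point `expect` at a fixture named `valid_email_address_fixture_output`, whose contents are not part of this diff. A plausible CSV version, stored under a test path such as `tests/fixtures/valid_email_address_fixture_output.csv`, might look like this — the last two expected values are inferred from the earlier `dict` example's validation logic and are an assumption:

```csv
email,is_valid_email_address
cool@example.com,true
cool@unknown.com,false
badgmail.com,false
missingdot@gmailcom,false
```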
 
 When using the `dict` or `csv` format, you only have to define the mock data for the columns relevant to you. This enables you to write succinct and _specific_ unit tests.
 
 :::note
@@ -226,7 +276,7 @@ Your model is now ready for production! Adding this unit test helped catch an is
 When configuring your unit test, you can override the output of macros, vars, or environment variables. This enables you to unit test your incremental models in "full refresh" and "incremental" modes.
 
 :::note
-Incremental models need to exist in the database first before running unit tests or doing a `dbt build`. Use the [`--empty` flag](/reference/commands/build#the---empty-flag) to build an empty version of the models to save warehouse spend. You can also optionally select only your incremental models using the [`--select` flag](/reference/node-selection/syntax#shorthand).
+Incremental models need to exist in the database before running unit tests or doing a `dbt build`. Use the [`--empty` flag](/reference/commands/build#the---empty-flag) to build an empty version of the models to save warehouse spend. You can also optionally select only your incremental models using the [`--select` flag](/reference/node-selection/syntax#shorthand).
 
 ```shell
 dbt run --select "config.materialized:incremental" --empty
@@ -260,7 +310,7 @@ where event_time > (select max(event_time) from {{ this }})
 
 You can define unit tests on `my_incremental_model` to ensure your incremental logic is working as expected:
 
-```yml
+```yaml
 
 unit_tests:
   - name: my_incremental_model_full_refresh_mode
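The unit test YAML above is truncated by the hunk. The pattern it introduces pins the `is_incremental()` macro through an `overrides` block; a sketch, where the `events` input and its rows are assumptions for illustration:

```yaml
# Sketch of the full-refresh-mode unit test; the `overrides` key follows
# the documented unit-test spec, while the input rows are illustrative.
unit_tests:
  - name: my_incremental_model_full_refresh_mode
    model: my_incremental_model
    overrides:
      macros:
        # run the unit test as if --full-refresh were passed
        is_incremental: false
    given:
      - input: ref('events')
        rows:
          - {event_id: 1, event_time: 2020-01-01}
    expect:
      rows:
        - {event_id: 1, event_time: 2020-01-01}
```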
@@ -307,7 +357,7 @@ There is currently no way to unit test whether the dbt framework inserted/merged
 
 If you want to unit test a model that depends on an ephemeral model, you must use `format: sql` for that input.
 
-```yml
+```yaml
 unit_tests:
   - name: my_unit_test
     model: dim_customers
