@tinglin-db tinglin-db commented Sep 11, 2025

Changes

To support least-privileged workspaces (LPW) on GCP, this PR adds a new field, expected_workspace_status, which is translated to workspace_state in the API request.
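
As a rough illustration of the translation described above, the sketch below copies a Terraform-side expected_workspace_status value into a workspace_state field of the request body. The createWorkspaceRequest type and the createRequestJSON helper are hypothetical names used only for this example; the provider's real request types and mapping code may look different.

```go
package main

import "encoding/json"

// createWorkspaceRequest is a hypothetical, trimmed-down request body.
// Only the names workspace_name and workspace_state come from the PR text.
type createWorkspaceRequest struct {
	WorkspaceName  string `json:"workspace_name"`
	WorkspaceState string `json:"workspace_state,omitempty"`
}

// createRequestJSON maps the Terraform attribute expected_workspace_status
// onto the API field workspace_state. An empty value is omitted from the
// payload entirely (assumption: the backend then applies its default).
func createRequestJSON(workspaceName, expectedWorkspaceStatus string) (string, error) {
	body, err := json.Marshal(createWorkspaceRequest{
		WorkspaceName:  workspaceName,
		WorkspaceState: expectedWorkspaceStatus,
	})
	if err != nil {
		return "", err
	}
	return string(body), nil
}
```

Calling createRequestJSON("ws", "PROVISIONING") yields a body that carries workspace_state, while an empty status leaves the key out.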

Tests

Added a unit test TestResourceWorkspaceCreateGcpWithExpectedProvisioning and an integration test TestMwsAccGcpWorkspacesWithExpectedProvisioning

  • make test run locally
  • using Go SDK
  • has entry in NEXT_CHANGELOG.md file

Tested E2E Locally

Terraform will perform the following actions:

  # databricks_mws_workspaces.this will be created
  + resource "databricks_mws_workspaces" "this" {
      + account_id                = (sensitive value)
      + cloud                     = (known after apply)
      + creation_time             = (known after apply)
      + effective_compute_mode    = (known after apply)
      + expected_workspace_status = "PROVISIONING"
      + gcp_workspace_sa          = (known after apply)
      + id                        = (known after apply)
      + is_no_public_ip_enabled   = true
      + location                  = "us-central1"
      + pricing_tier              = (known after apply)
      + workspace_id              = (known after apply)
      + workspace_name            = "tlin-classic-test-tf-1"
      + workspace_status          = (known after apply)
      + workspace_status_message  = (known after apply)
      + workspace_url             = (known after apply)

      + cloud_resource_container {
          + gcp {
              + project_id = "databricks-cal-dev-testing"
            }
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

databricks_mws_workspaces.this: Creating...
databricks_mws_workspaces.this: Creation complete after 2s [id=9fcbb245-7c44-4522-9870-e38324104cf8/2181571221551671]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

  # databricks_mws_workspaces.this will be updated in-place
  ~ resource "databricks_mws_workspaces" "this" {
      - expected_workspace_status = "PROVISIONING" -> null
        id                        = "9fcbb245-7c44-4522-9870-e38324104cf8/2181571221551671"
      + network_id                = "a124db3a-0928-4698-97eb-70e751c87934"
        # (14 unchanged attributes hidden)

        # (3 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

databricks_mws_workspaces.this: Modifying... [id=9fcbb245-7c44-4522-9870-e38324104cf8/2181571221551671]
databricks_mws_workspaces.this: Modifications complete after 6s [id=9fcbb245-7c44-4522-9870-e38324104cf8/2181571221551671]

Apply complete! Resources: 0 added, 1 changed, 0 destroyed.

@tinglin-db tinglin-db requested review from a team as code owners September 11, 2025 22:08
@tinglin-db tinglin-db requested review from tanmay-db and removed request for a team September 11, 2025 22:08
@tinglin-db tinglin-db changed the title from "Add expected_workspace_status in databricks_mws_workspaces to support LPW" to "[Draft] Add expected_workspace_status in databricks_mws_workspaces to support LPW" Sep 11, 2025
Contributor

@nkvuong nkvuong left a comment

a few comments

Contributor

alexott commented Sep 12, 2025

please run make fmt ws

@tinglin-db
Author

jenkins trigger all

})
}

func TestMwsAccGcpWorkspacesWithExpectedProvisioning(t *testing.T) {
Author

Integration test run (link)
On its last run, this integration test failed during the post-test destroy due to not being able to delete a still PROVISIONING workspace. Error message:

Error: -26T17:50:17.498Z [ERROR] sdk.helper_resource: Error running post-test destroy, there may be dangling resources: test_terraform_path=/tmp/plugintest-terraform1770350163/terraform test_step_number=1
  | Error: cannot delete mws workspaces: INVALID_STATE: workspace 49063535405170 cannot be deleted because it is in status PROVISIONING but not must be in one of: PROVISIONING

I pushed a new commit to catch INVALID_STATE_TRANSITION errors in DELETE, but this is probably not the best approach. Is there a way to have the integration test run through both CREATE and UPDATE instead, so that the workspace ends up in RUNNING status? cc @panchenxue-databricks

Contributor

You probably need to support this case of deleting workspaces in PROVISIONING, since users could attempt to delete a partially created workspace, or they could modify an attribute that requires recreation (like deployment name).

For this test, you could also just add an additional step to put the workspace in a running state by updating expected_workspace_status.

Contributor

alexott commented Sep 27, 2025

@tinglin-db Now it fails with:

Error: cannot create mws workspaces: Workspace limit exceeded, only 20 workspaces are allowed for your account

That's why I think deletion in the PROVISIONING state should be handled on the backend; otherwise there will be dangling resources.

WorkspaceURL string `json:"workspace_url,omitempty" tf:"computed"`
WorkspaceStatus string `json:"workspace_status,omitempty" tf:"computed"`
WorkspaceStatusMessage string `json:"workspace_status_message,omitempty" tf:"computed"`
ExpectedWorkspaceStatus string `json:"expected_workspace_status,omitempty"`
Contributor

We should also document this in docs/resources/mws_workspaces.md.

Author

Is this true even if the feature is still in private preview?

Contributor

Yes. Private preview services, methods and fields (and their documentation) are still part of the SDKs. Currently, private preview methods are not shown in the REST API documentation. For example: https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/feature_engineering_feature

I would like us to add badges on individual fields if their stability level lags behind the resource level, but we haven't gotten to that yet. In the meantime, we can add a note in the documentation that this feature is in private preview.

Author

Added documentation, let me know if the wording works

Comment on lines 687 to 692
if expectedStatus == WorkspaceStatusProvisioning {
	log.Printf("[INFO] Expected status is PROVISIONING, skipping wait in read function")
	err = nil
} else {
	err = workspacesAPI.WaitForExpectedStatus(workspace, d.Timeout(schema.TimeoutRead))
}
Contributor

We should have one place for logic around waiting for workspace status. I believe this is unnecessary because the implementation of WaitForExpectedStatus already handles this properly.

Author

I modified WaitForExpectedStatus to take in the expected_status. We need to pass it in from d.Get("expected_workspace_status") because the field is not returned by the GET during polling
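
To make the idea concrete, here is a minimal, self-contained sketch of a wait loop that takes the expected status as a parameter and falls back to RUNNING when none is given. It is not the provider's implementation: the statusGetter type, the lower-case waitForExpectedStatus function, the interval parameter, and the treatment of FAILED as a terminal status are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

const (
	statusProvisioning = "PROVISIONING"
	statusRunning      = "RUNNING"
	statusFailed       = "FAILED" // assumed terminal status for this sketch
)

// statusGetter is a hypothetical stand-in for the GET call made while polling.
// It returns only the current status, mirroring the point that the expected
// status is not part of the GET response and must be passed in by the caller.
type statusGetter func() (string, error)

// waitForExpectedStatus polls until the workspace reports expectedStatus.
// An empty expectedStatus means "wait for RUNNING".
func waitForExpectedStatus(get statusGetter, expectedStatus string, timeout, interval time.Duration) error {
	if expectedStatus == "" {
		expectedStatus = statusRunning
	}
	deadline := time.Now().Add(timeout)
	for {
		status, err := get()
		if err != nil {
			return err
		}
		if status == expectedStatus {
			return nil
		}
		if status == statusFailed {
			return fmt.Errorf("workspace failed while waiting for %s", expectedStatus)
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("timed out waiting for %s, last status %s", expectedStatus, status)
		}
		time.Sleep(interval)
	}
}
```

With this shape, a caller that configured PROVISIONING returns as soon as the first poll reports PROVISIONING, while a caller that left the field unset keeps polling until RUNNING. A real implementation would also have to decide how to treat a workspace that has already moved past the expected status; this sketch would simply time out in that case.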

… logic, removed deletion skip, removed read skip
Contributor

@mgyucht mgyucht left a comment

Aligned on the pathway forward, let's wrap this up.

  		return err
  	}
- 	return a.WaitForRunning(ws, timeout)
+ 	return a.WaitForExpectedStatus(ws, WorkspaceStatusRunning, timeout)
Contributor

Let's set the expected workspace status that is provided by the user in this request.

Author

Updated

})
}

func TestMwsAccGcpWorkspacesWithExpectedProvisioning(t *testing.T) {
Contributor

Separately, two more test cases:

  1. New workspace in provisioning state -> set expected state to RUNNING
  2. New workspace in provisioning state -> unset expected state, workspace should progress to RUNNING.
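
The two cases above can be sketched as pairs of configurations, assuming each test step is driven by an HCL configuration string. The helpers workspaceTemplate, provisioningThenRunning, and provisioningThenUnset are hypothetical names for this example, and the rendered block is deliberately minimal: a real configuration needs account_id, location, and the other required attributes.

```go
package main

import (
	"fmt"
	"strings"
)

// workspaceTemplate renders a minimal, illustrative databricks_mws_workspaces
// block for one test step. An empty expectedStatus omits the attribute, which
// models "unset expected state".
func workspaceTemplate(name, expectedStatus string) string {
	var b strings.Builder
	b.WriteString("resource \"databricks_mws_workspaces\" \"this\" {\n")
	fmt.Fprintf(&b, "  workspace_name = %q\n", name)
	if expectedStatus != "" {
		fmt.Fprintf(&b, "  expected_workspace_status = %q\n", expectedStatus)
	}
	b.WriteString("}\n")
	return b.String()
}

// provisioningThenRunning models case 1: create in PROVISIONING, then set the
// expected state to RUNNING in a second step.
func provisioningThenRunning(name string) []string {
	return []string{
		workspaceTemplate(name, "PROVISIONING"),
		workspaceTemplate(name, "RUNNING"),
	}
}

// provisioningThenUnset models case 2: create in PROVISIONING, then drop the
// attribute so the workspace is expected to progress to RUNNING.
func provisioningThenUnset(name string) []string {
	return []string{
		workspaceTemplate(name, "PROVISIONING"),
		workspaceTemplate(name, ""),
	}
}
```

Each returned slice holds the configuration for step one followed by step two; the assertions on the resulting workspace_status would live in the test harness and are left out here.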

Author

@tinglin-db tinglin-db Oct 8, 2025

Sounds good, added these test cases

Contributor

@mgyucht mgyucht left a comment

Very close, some nits but this is basically looking good to me.

}

- var workspaceRunningUpdatesAllowed = []string{"credentials_id", "network_id", "storage_customer_managed_key_id", "private_access_settings_id", "managed_services_customer_managed_key_id", "custom_tags"}
+ var workspaceRunningUpdatesAllowed = []string{"credentials_id", "network_id", "storage_customer_managed_key_id", "private_access_settings_id", "managed_services_customer_managed_key_id", "custom_tags", "expected_workspace_status"}
Contributor

I know this isn't your fault, but would you mind formatting this list with one entry per line? Changes to this list are easier to review this way.
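
For reference, the requested layout would look roughly like the following. The entries are copied from the updated line in the diff; only the formatting changes, and the doc comment is an inference from the variable name.

```go
package main

// workspaceRunningUpdatesAllowed appears, judging by its name, to list the
// attributes that may be updated on a running workspace.
var workspaceRunningUpdatesAllowed = []string{
	"credentials_id",
	"network_id",
	"storage_customer_managed_key_id",
	"private_access_settings_id",
	"managed_services_customer_managed_key_id",
	"custom_tags",
	"expected_workspace_status",
}
```

With one entry per line, adding a field such as expected_workspace_status shows up as a single added line in future diffs.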

- func (a WorkspacesAPI) WaitForRunning(ws Workspace, timeout time.Duration) error {
+ // If expected_workspace_status is specified, WaitForExpectedStatus will wait until workspace is in the expected status.
+ // If not, it will wait until workspace is running, and otherwise will try to explain why it failed.
+ func (a WorkspacesAPI) WaitForExpectedStatus(ws Workspace, expected_status string, timeout time.Duration) error {
Contributor

Suggested change
- func (a WorkspacesAPI) WaitForExpectedStatus(ws Workspace, expected_status string, timeout time.Duration) error {
+ func (a WorkspacesAPI) WaitForExpectedStatus(ws Workspace, expectedStatus string, timeout time.Duration) error {

- err = workspacesAPI.WaitForRunning(workspace, d.Timeout(schema.TimeoutRead))
+ // The expected_workspace_status field is input only.
+ // Therefore, we need to read it from the original Terraform configuration.
+ expectedStatus := d.Get("expected_workspace_status").(string)
Contributor

Suggested change
- expectedStatus := d.Get("expected_workspace_status").(string)
+ // PROVISIONING workspace import may fail because "expected_workspace_status" is not included in the state during import, nor is it returned by the API.
+ // As a result, the provider will (likely) wait for RUNNING state, which will never happen, and time out.
+ // TODO: fix this.
+ expectedStatus := d.Get("expected_workspace_status").(string)

Author

Added comment

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/terraform

Inputs:

  • PR number: 5019
  • Commit SHA: 775ba1456d22a09558e96ebb828bbd45f0bd1355

Checks will be approved automatically on success.
