Skip to content

fix: tfpluginsdk nil panics when LateInit off#600

Open
jwefers wants to merge 1 commit intocrossplane:mainfrom
jwefers:fix/tfpluginsdk-partial-values
Open

fix: tfpluginsdk nil panics when LateInit off#600
jwefers wants to merge 1 commit intocrossplane:mainfrom
jwefers:fix/tfpluginsdk-partial-values

Conversation

@jwefers
Copy link

@jwefers jwefers commented Feb 19, 2026

Description of your changes

This solves an issue we encountered when converting a (not-yet-opensourced sadly) provider for auth0 to using tfpluginsdk direct invocation.
terraform-provider-auth0, in its code paths for resource Updates (schema.Resource.UpdateContext) expects fully filled schemas. If there is a nil anywhere, it panics. We've seen this for arrays in particular.

We also mostly turn off LateInitialize for our resources, as we only want partial control of resources via Crossplane.

In the old Terraform CLI invocation mode, all worked well, because Terraform constructs a value for the provider based on the last observed state and then merging the desired config into it.

Contrary, upjet tfpluginsdk code did not do this: It constructed the values that are passed to the tf providers UpdateContext func purely based on the current spec.forProvider. Which, without LateInit, is likely partial. This led to those panics.

This MR is vibecoded, full discosure, and emulates TF behavior by constructing the value for the TF provider based on the last observed state, then applying the desired spec over it.

I have NOT looked at the tfpluginfw code if that fix is needed there as well

I have:

  • Read and followed Upjet's contribution process.
  • [x Run make reviewable to ensure this PR is ready for review.
  • Added backport release-x.y labels to auto-backport this PR if necessary.

How has this code been tested

Using our own provider-auth0 with full tfpluginsdk usage with this PR in prod with hundreds of resources, most having LateInit off and partial .spec.forProviders

@jwefers jwefers changed the title fix: tfpluginsdk sending partial terraform resources to provider code fix: tfpluginsdk nil panics when LateInit off Feb 19, 2026
@coderabbitai
Copy link

coderabbitai bot commented Feb 19, 2026

📝 Walkthrough

Walkthrough

Adds a defensive pre-processing step in the Terraform instance diff computation that merges the observed instance state with existing desired parameters before creating the resource configuration. Introduces a new helper method to convert instance state to a JSON map suitable for diff operations.

Changes

Cohort / File(s) Summary
Terraform Diff Computation
pkg/controller/external_tfpluginsdk.go
Added pre-processing logic that merges observed instance state (converted to JSON) with existing desired parameters before diff computation. Introduced fromInstanceStateToJSONMapForDiff helper method to handle instance state conversion. The augmented parameter set is now used for resource configuration creation and diff generation, while maintaining existing diff calculation and timeout handling flows.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Template Breaking Changes ⚠️ Warning Changes to external_tfpluginsdk.go affect all generated providers using tfpluginsdk, but testing validation is limited to terraform-provider-auth0 only. Test this foundational change with at least 2-3 additional providers (AWS, GCP, Azure) before merging to ensure no regressions across different Terraform implementations.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly describes the specific fix: preventing nil panics in tfpluginsdk when LateInit is disabled, which matches the main change in the changeset.
Description check ✅ Passed The description thoroughly explains the problem (partial values causing panics in terraform-provider-auth0), the solution (merging observed state with desired config), and testing approach, all directly related to the changeset.
Configuration Api Breaking Changes ✅ Passed PR modifies only pkg/controller/external_tfpluginsdk.go with a private helper method; no changes to pkg/config/** or exported APIs.
Generated Code Manual Edits ✅ Passed No files matching the zz_*.go pattern were found in the repository, and the modified file pkg/controller/external_tfpluginsdk.go is regular source code, not generated code.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
pkg/controller/external_tfpluginsdk.go (1)

841-852: fromInstanceStateToJSONMapForDiff is a near-verbatim duplicate of fromInstanceStateToJSONMap.

The only difference from the existing fromInstanceStateToJSONMap (Lines 828-839) is that this variant drops the cty.Value return. You could eliminate this new method entirely by delegating to the existing one and discarding the second return value — both in the helper and at the call site.

♻️ Proposed refactor: delegate to existing method

Remove fromInstanceStateToJSONMapForDiff entirely and update the call site in getResourceDataDiff:

-func (n *terraformPluginSDKExternal) fromInstanceStateToJSONMapForDiff(newState *tf.InstanceState) (map[string]interface{}, error) {
-	impliedType := n.config.TerraformResource.CoreConfigSchema().ImpliedType()
-	attrsAsCtyValue, err := newState.AttrsAsObjectValue(impliedType)
-	if err != nil {
-		return nil, errors.Wrap(err, "could not convert attrs to cty value")
-	}
-	stateValueMap, err := schema.StateValueToJSONMap(attrsAsCtyValue, impliedType)
-	if err != nil {
-		return nil, errors.Wrap(err, "could not convert instance state value to JSON")
-	}
-	return stateValueMap, nil
-}

And at the call site (Line 445):

-		stateValueMap, err := n.fromInstanceStateToJSONMapForDiff(s)
+		stateValueMap, _, err := n.fromInstanceStateToJSONMap(s)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/external_tfpluginsdk.go` around lines 841 - 852, The function
fromInstanceStateToJSONMapForDiff is a near-duplicate of
fromInstanceStateToJSONMap; remove fromInstanceStateToJSONMapForDiff and update
the caller(s) (e.g., getResourceDataDiff) to call fromInstanceStateToJSONMap and
ignore the extra cty.Value return (e.g., _, err :=
fromInstanceStateToJSONMap(...)); preserve existing error handling and return
behavior from the original method when propagating errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/controller/external_tfpluginsdk.go`:
- Around line 444-455: The conversion error from
fromInstanceStateToJSONMapForDiff is being dropped because err is scoped inside
the inner if and never logged or returned, causing a silent fallback to
n.params; change the logic so the err is captured in the outer scope, and if
fromInstanceStateToJSONMapForDiff returns an error emit a warning/debug log
(using the controller's logger already in this file, e.g., r.logger or similar)
that includes the error and context (resourceExists, n.Name or similar), then
proceed with the graceful fallback to params = n.params (or alternatively return
the error from the surrounding function/Observe if you prefer surfacing
failures); ensure you still merge stateValueMap into params when no error
occurs.
- Around line 435-456: The framework implementation reuses a single sparse
resourceTerraformConfigValue set in Connect(), which causes nulls from
spec.forProvider to override prior observed state when building the proposed
state; fix by merging the observed instance state into the config before
diffing: in Observe() (or immediately before getDiffPlanResponse()) read the
current instance state, convert it to a JSON map the same way
external_tfpluginsdk.go does (e.g., fromInstanceStateToJSONMapForDiff
equivalent), and overlay the static resourceTerraformConfigValue entries onto
that merged map so previously-computed/non-specified attributes keep their
observed values; alternatively, change proposedState() to prefer the observed
state value when config value is nil for non-computed attributes (replace the
"newV = configV" branch to guard against nil configV by using stateV if configV
is nil), referencing resourceTerraformConfigValue, Observe(), proposedState(),
and getDiffPlanResponse() to locate where to apply the change.

---

Nitpick comments:
In `@pkg/controller/external_tfpluginsdk.go`:
- Around line 841-852: The function fromInstanceStateToJSONMapForDiff is a
near-duplicate of fromInstanceStateToJSONMap; remove
fromInstanceStateToJSONMapForDiff and update the caller(s) (e.g.,
getResourceDataDiff) to call fromInstanceStateToJSONMap and ignore the extra
cty.Value return (e.g., _, err := fromInstanceStateToJSONMap(...)); preserve
existing error handling and return behavior from the original method when
propagating errors.

Comment on lines 435 to +456
func (n *terraformPluginSDKExternal) getResourceDataDiff(tr resource.Terraformed, ctx context.Context, s *tf.InstanceState, resourceExists bool) (*tf.InstanceDiff, error) { //nolint:gocyclo
resourceConfig := tf.NewResourceConfigRaw(n.params)
// before we calculate the diff and thus what we send over into the terraform provider
// code, we construct the desired resource fully specified, by taking the observed
// state from the upstream resource and then apply the desired spec.forProvider of the crossplane
// resource over it. This mirrors what terraform/opentofu do, because some terraform providers
// do not like nil values when processing the desired resource. This is especially important
// when managementPolicies have LateInitialize excluded, which means spec.forProvider can
// be very sparse.
params := n.params
if resourceExists && s != nil && len(n.params) > 0 {
stateValueMap, err := n.fromInstanceStateToJSONMapForDiff(s)
if err == nil && len(stateValueMap) > 0 {
params = make(map[string]any, len(stateValueMap)+len(n.params))
for k, v := range stateValueMap {
params[k] = v
}
for k, v := range n.params {
params[k] = v
}
}
}
resourceConfig := tf.NewResourceConfigRaw(params)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find external_tfpluginfw.go and check its structure
fd -e go -g '*tfpluginfw*'

Repository: crossplane/upjet

Length of output: 184


🏁 Script executed:

#!/bin/bash
# Search for getResourceDataDiff method in Framework implementation
rg "getResourceDataDiff" --type go

Repository: crossplane/upjet

Length of output: 432


🏁 Script executed:

#!/bin/bash
# Get the full content of external_tfpluginfw.go to understand the diff computation
fd -e go -g '*tfpluginfw*' -x cat {} -n

Repository: crossplane/upjet

Length of output: 50372


🏁 Script executed:

#!/bin/bash
# Look for the Observe method in external_tfpluginfw.go to see how it computes diffs
rg -A 50 "func \(n \*terraformPluginFrameworkExternalClient\) Observe" pkg/controller/external_tfpluginfw.go

Repository: crossplane/upjet

Length of output: 2386


🏁 Script executed:

#!/bin/bash
# Search for any state-related or diff-related logic in the Framework implementation
rg "getDiffPlanResponse|filteredDiffExists|proposed|stateValueMap" pkg/controller/external_tfpluginfw.go

Repository: crossplane/upjet

Length of output: 2401


🏁 Script executed:

#!/bin/bash
# Look for the proposedState function to understand how it computes proposed state
rg -B 5 -A 20 "func proposedState" pkg/controller/external_tfpluginfw.go

Repository: crossplane/upjet

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Check how getDiffPlanResponse is called in Observe method and if state merging happens
rg -B 10 -A 30 "planResponse, hasDiff, err := n.getDiffPlanResponse" pkg/controller/external_tfpluginfw.go

Repository: crossplane/upjet

Length of output: 1677


🏁 Script executed:

#!/antml:thinking
Also need to understand:
1. Is there state merging before getDiffPlanResponse is called?
2. Look at the full Observe method to see the complete flow
</thinking>
#!/bin/bash
# Get full Observe method to see complete flow
rg -A 150 "func \(n \*terraformPluginFrameworkExternalClient\) Observe\(ctx context.Context, mg xpresource.Managed\)" pkg/controller/external_tfpluginfw.go | head -200

Repository: crossplane/upjet

Length of output: 289


🏁 Script executed:

#!/bin/bash
# Look for proposedState function definition
rg -B 2 -A 15 "^func proposedState" pkg/controller/

Repository: crossplane/upjet

Length of output: 1342


🏁 Script executed:

#!/bin/bash
# Check if Framework implementation merges state like SDK does before calling getDiffPlanResponse
rg -B 20 "planResponse, hasDiff, err := n.getDiffPlanResponse" pkg/controller/external_tfpluginfw.go | grep -A 20 "stateValueMap\|params\|merge"

Repository: crossplane/upjet

Length of output: 583


🏁 Script executed:

#!/bin/bash
# Let's see the full context around where getDiffPlanResponse is called
sed -n '400,500p' pkg/controller/external_tfpluginfw.go

Repository: crossplane/upjet

Length of output: 4016


🏁 Script executed:

#!/bin/bash
# Check if Framework's Connect method merges state like SDK version does
rg -A 5 "resourceTerraformConfigValue.*=" pkg/controller/external_tfpluginfw.go | head -20

Repository: crossplane/upjet

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Compare: look at SDK version's Connect to see when params are set
rg -B 5 -A 10 "n.params = " pkg/controller/external_tfpluginsdk.go | head -30

Repository: crossplane/upjet

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Look at the full getDiffPlanResponse in SDK to see the state merging
rg -B 5 -A 80 "func \(n \*terraformPluginSDKExternal\) getResourceDataDiff" pkg/controller/external_tfpluginsdk.go

Repository: crossplane/upjet

Length of output: 3494


🏁 Script executed:

#!/bin/bash
# Look at the full proposedNew implementation to understand the merging logic
rg -A 100 "^func proposedNew" pkg/controller/proposed_state.go | head -150

Repository: crossplane/upjet

Length of output: 5555


🏁 Script executed:

#!/antml:thinking>
I need to see the full proposedNew implementation to understand if it truly merges state with config in a way that prevents the partial-value panic.
</thinking>

Repository: crossplane/upjet

Length of output: 204


🏁 Script executed:

#!/an/bin/bash
# Check how n.resourceTerraformConfigValue is set in Framework's Connect method
rg -B 5 -A 15 "resourceTerraformConfigValue:" pkg/controller/external_tfpluginfw.go

Repository: crossplane/upjet

Length of output: 998


🏁 Script executed:

#!/bin/bash
# Check if resourceTerraformConfigValue is ever updated after Connect
rg "resourceTerraformConfigValue\s*=" pkg/controller/external_tfpluginfw.go

Repository: crossplane/upjet

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Look at how params is constructed in Framework's Connect vs SDK's Connect
rg -A 5 "params.*:=" pkg/controller/external_tfpluginfw.go | head -40

Repository: crossplane/upjet

Length of output: 405


🏁 Script executed:

#!/bin/bash
# See the full picture of what happens with params in Framework Connect
sed -n '160,260p' pkg/controller/external_tfpluginfw.go

Repository: crossplane/upjet

Length of output: 4770


🏁 Script executed:

#!/bin/bash
# Check if resourceConfigTFValue/resourceTerraformConfigValue is static or gets updated
rg "SetReconstructedFrameworkTFState\|SetFrameworkTFState" pkg/controller/external_tfpluginfw.go | head -20

Repository: crossplane/upjet

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Check what happens in getDiffPlanResponse - does it merge state with config like SDK does?
sed -n '350,410p' pkg/controller/external_tfpluginfw.go

Repository: crossplane/upjet

Length of output: 2864


🏁 Script executed:

#!/bin/bash
# Compare: In SDK's getResourceDataDiff, state is merged. Does Framework do similar merging?
# Look at proposedState call to see what gets passed
rg -B 3 -A 3 "proposedState\(n.resourceSchema" pkg/controller/external_tfpluginfw.go

Repository: crossplane/upjet

Length of output: 489


🏁 Script executed:

#!/bin/bash
# Check Framework's Create method to see how it handles partial params
rg -A 80 "func \(n \*terraformPluginFrameworkExternalClient\) Create\(ctx context.Context" pkg/controller/external_tfpluginfw.go | head -100

Repository: crossplane/upjet

Length of output: 3828


🏁 Script executed:

#!/antml:thinking>
I need to check if Create/Update methods in Framework use getDiffPlanResponse or if they have a different code path that might not do the state merging.
</thinking>

Repository: crossplane/upjet

Length of output: 204


🏁 Script executed:

#!/bin/bash
# Check Framework's Update method
rg -A 60 "func \(n \*terraformPluginFrameworkExternalClient\) Update\(ctx context.Context" pkg/controller/external_tfpluginfw.go | head -80

Repository: crossplane/upjet

Length of output: 2802


🏁 Script executed:

#!/bin/bash
# Look at proposedNewAttributes to understand how it merges prior and config
rg -A 50 "^func proposedNewAttributes" pkg/controller/proposed_state.go

Repository: crossplane/upjet

Length of output: 1997


🏁 Script executed:

#!/bin/bash
# Verify: In SDK's Connect, are params initially from spec.forProvider (potentially sparse)?
# Look at how params are constructed in SDK Connect
rg -B 5 -A 20 "func \(c \*TerraformSDKConnector\) Connect\(ctx context.Context" pkg/controller/external_tfpluginsdk.go | head -50

Repository: crossplane/upjet

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Check if Framework's resourceTerraformConfigValue gets updated during Observe or stays static
rg "resourceTerraformConfigValue" pkg/controller/external_tfpluginfw.go

Repository: crossplane/upjet

Length of output: 771


🏁 Script executed:

#!/bin/bash
# Check if Framework's proposedState provides same protection as SDK's state merge
# Look at the exact logic in proposedNewAttributes for non-computed attributes
sed -n '90,150p' pkg/controller/proposed_state.go

Repository: crossplane/upjet

Length of output: 2368


🏁 Script executed:

#!/bin/bash
# Confirm: Does Framework ever update params/config after initial Connect?
# Search for any updates to params or resourceTerraformConfigValue after Connect
rg "params\s*=" pkg/controller/external_tfpluginfw.go | grep -v "resourceTerraformConfigValue:"

Repository: crossplane/upjet

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Look at SDK's comment about why state merging is needed
sed -n '430,465p' pkg/controller/external_tfpluginsdk.go | head -40

Repository: crossplane/upjet

Length of output: 1733


Framework implementation (external_tfpluginfw.go) has the same partial-value risk and needs similar protection.

Unlike the SDK implementation which re-merges state into params on each getResourceDataDiff() call, the Framework implementation sets resourceTerraformConfigValue once in Connect() from the sparse spec.forProvider, then reuses it statically throughout the resource lifecycle.

The proposedState() function does attempt to merge prior state with config, but for non-computed attributes it explicitly prefers config values even when null:

} else {
    // For non-computed attributes, we always take the config value,
    // even if it is null.
    newV = configV
}

When spec.forProvider is sparse (due to managementPolicies with LateInitialize excluded), this means optional fields not in the config remain null in the proposed state sent to the provider—the exact scenario the SDK fix addresses.

Consider opening a follow-up PR to apply a similar fix: either re-merge state into params during Observe(), or update resourceTerraformConfigValue dynamically with observed state before calling getDiffPlanResponse().

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/external_tfpluginsdk.go` around lines 435 - 456, The framework
implementation reuses a single sparse resourceTerraformConfigValue set in
Connect(), which causes nulls from spec.forProvider to override prior observed
state when building the proposed state; fix by merging the observed instance
state into the config before diffing: in Observe() (or immediately before
getDiffPlanResponse()) read the current instance state, convert it to a JSON map
the same way external_tfpluginsdk.go does (e.g.,
fromInstanceStateToJSONMapForDiff equivalent), and overlay the static
resourceTerraformConfigValue entries onto that merged map so
previously-computed/non-specified attributes keep their observed values;
alternatively, change proposedState() to prefer the observed state value when
config value is nil for non-computed attributes (replace the "newV = configV"
branch to guard against nil configV by using stateV if configV is nil),
referencing resourceTerraformConfigValue, Observe(), proposedState(), and
getDiffPlanResponse() to locate where to apply the change.

Comment on lines +444 to +455
if resourceExists && s != nil && len(n.params) > 0 {
stateValueMap, err := n.fromInstanceStateToJSONMapForDiff(s)
if err == nil && len(stateValueMap) > 0 {
params = make(map[string]any, len(stateValueMap)+len(n.params))
for k, v := range stateValueMap {
params[k] = v
}
for k, v := range n.params {
params[k] = v
}
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

The conversion error is silently discarded, leaving a silent fallback to the pre-fix panic path.

When fromInstanceStateToJSONMapForDiff returns an error (e.g. a schema mismatch), the err variable is scoped to the inner if block and never logged or returned — the code silently falls back to params = n.params, which is precisely the partial-value path this PR is fixing. There is no signal to a provider developer (or Upjet maintainer) that the state merge was skipped.

At minimum, emit a debug/warning log so the fallback is observable. Depending on your resilience preference, you could also return the error (which would surface it in Observe). What was the intent — always degrade gracefully, or only on truly transient conversion failures?

🛡️ Proposed fix: log the error before falling back
 	if resourceExists && s != nil && len(n.params) > 0 {
 		stateValueMap, err := n.fromInstanceStateToJSONMapForDiff(s)
-		if err == nil && len(stateValueMap) > 0 {
+		if err != nil {
+			n.logger.Debug("Failed to convert instance state to JSON map for diff; falling back to spec-only params",
+				"error", err)
+		} else if len(stateValueMap) > 0 {
 			params = make(map[string]any, len(stateValueMap)+len(n.params))
 			for k, v := range stateValueMap {
 				params[k] = v
 			}
 			for k, v := range n.params {
 				params[k] = v
 			}
 		}
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if resourceExists && s != nil && len(n.params) > 0 {
stateValueMap, err := n.fromInstanceStateToJSONMapForDiff(s)
if err == nil && len(stateValueMap) > 0 {
params = make(map[string]any, len(stateValueMap)+len(n.params))
for k, v := range stateValueMap {
params[k] = v
}
for k, v := range n.params {
params[k] = v
}
}
}
if resourceExists && s != nil && len(n.params) > 0 {
stateValueMap, err := n.fromInstanceStateToJSONMapForDiff(s)
if err != nil {
n.logger.Debug("Failed to convert instance state to JSON map for diff; falling back to spec-only params",
"error", err)
} else if len(stateValueMap) > 0 {
params = make(map[string]any, len(stateValueMap)+len(n.params))
for k, v := range stateValueMap {
params[k] = v
}
for k, v := range n.params {
params[k] = v
}
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/external_tfpluginsdk.go` around lines 444 - 455, The
conversion error from fromInstanceStateToJSONMapForDiff is being dropped because
err is scoped inside the inner if and never logged or returned, causing a silent
fallback to n.params; change the logic so the err is captured in the outer
scope, and if fromInstanceStateToJSONMapForDiff returns an error emit a
warning/debug log (using the controller's logger already in this file, e.g.,
r.logger or similar) that includes the error and context (resourceExists, n.Name
or similar), then proceed with the graceful fallback to params = n.params (or
alternatively return the error from the surrounding function/Observe if you
prefer surfacing failures); ensure you still merge stateValueMap into params
when no error occurs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant