Sibling errors should not be added after propagation #1184

Open
wants to merge 1 commit into
base: clarify-one-error-per-result-position

Member

@benjie benjie commented Jul 10, 2025

This PR is built on top of:


GraphQL.js output is not (currently) stable after an operation terminates: more errors may be added to the result after the promise has resolved!

Reproduction with `graphql` module (`test.mts`):
import type { ExecutionResult } from "graphql";
import {
  graphql,
  GraphQLInt,
  GraphQLNonNull,
  GraphQLObjectType,
  GraphQLSchema,
} from "graphql";

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

const Test = new GraphQLObjectType({
  name: "Test",
  fields: {
    a: {
      type: GraphQLInt,
      async resolve() {
        await sleep(0);
        throw new Error(`a`);
      },
    },
    b: {
      type: new GraphQLNonNull(GraphQLInt),
      async resolve() {
        await sleep(10);
        throw new Error(`b`);
      },
    },
    c: {
      type: GraphQLInt,
      async resolve() {
        await sleep(20);
        throw new Error(`c`);
      },
    },
  },
});

const Query = new GraphQLObjectType({
  name: "Query",
  fields: {
    test: {
      type: Test,
      resolve() {
        return {};
      },
    },
  },
});
const schema = new GraphQLSchema({
  query: Query,
});

const result = await graphql({
  schema,
  source: `{ test { a b c } }`,
});

console.log("Result:");
console.log();
console.log(JSON.stringify(result, null, 2));
await sleep(100);
console.log();
console.log("Exact same object 100ms later:");
console.log();
console.log(JSON.stringify(result, null, 2));
$ node test.mts 
Result:

{
  "errors": [
    { "message": "a", "path": ["test", "a"] },
    { "message": "b", "path": ["test", "b"] }
  ],
  "data": { "test": null }
}

Exact same object 100ms later:

{
  "errors": [
    { "message": "a", "path": ["test", "a"] },
    { "message": "b", "path": ["test", "b"] },
    { "message": "c", "path": ["test", "c"] }
  ],
  "data": { "test": null }
}

(I've formatted this output for brevity)

The reason for this: although we note in the spec that you may cancel sibling execution positions, we don't do that in GraphQL.js; furthermore, we still process errors from the sibling results and add them to the errors list!
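Until that changes, a caller can protect itself by copying the errors list as soon as the promise resolves. The following is a minimal sketch, not part of graphql-js: `snapshotResult` and `ResultLike` are hypothetical names, and the plain objects stand in for a real `ExecutionResult`. It assumes the late errors arrive through the result's `errors` array, as the reproduction above suggests.

```typescript
// Minimal structural type so the sketch is self-contained; with the real
// library you would use `ExecutionResult` from "graphql" instead.
interface ResultLike {
  data?: unknown;
  errors?: ReadonlyArray<unknown>;
}

// Hypothetical helper: returns a new result object with its own copy of the
// errors array, so later pushes to the original are not visible through it.
function snapshotResult<T extends ResultLike>(result: T): T {
  return result.errors
    ? { ...result, errors: [...result.errors] }
    : { ...result };
}

// Simulate the behaviour from the reproduction with plain objects.
const liveErrors = [{ message: "a" }, { message: "b" }];
const live = { data: { test: null }, errors: liveErrors };
const frozen = snapshotResult(live);

// A late sibling error arrives after the caller already has the result.
liveErrors.push({ message: "c" });

console.log(live.errors.length); // 3
console.log(frozen.errors.length); // 2
```

This only hides the symptom on the caller's side; the executor is still doing the extra work.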

This is particularly problematic for client-side "throw on error". Given this schema:

type Query {
  test: Test
}
type Test {
  a: Int  # Throws immediately
  b: Int! # Throws after 10ms
  c: Int  # Throws after 20ms
}

And the same spec-valid result as above:

{
  "errors": [
    { "message": "a", "path": ["test", "a"] },
    { "message": "b", "path": ["test", "b"] },
    { "message": "c", "path": ["test", "c"] }
  ],
  "data": { "test": null }
}

Technically the Test.b field is the field that caused data.test to be null - it's non-nullable, so it triggered error propagation - but without looking at the schema we can't determine this.
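To make the ambiguity concrete, here is a small sketch of what a schema-unaware client can do with that response. The helper names (`startsWithPath`, `errorsUnder`) and the `ErrorLike` type are illustrative assumptions, not an existing client API.

```typescript
type Path = ReadonlyArray<string | number>;

interface ErrorLike {
  message: string;
  path?: Path;
}

// True when `path` begins with every segment of `prefix`.
function startsWithPath(path: Path, prefix: Path): boolean {
  return (
    prefix.length <= path.length &&
    prefix.every((segment, i) => path[i] === segment)
  );
}

// Hypothetical helper: every error located at or below the nulled position.
function errorsUnder(
  errors: ReadonlyArray<ErrorLike>,
  nulledPath: Path,
): ErrorLike[] {
  return errors.filter(
    (e) => e.path !== undefined && startsWithPath(e.path, nulledPath),
  );
}

const responseErrors: ErrorLike[] = [
  { message: "a", path: ["test", "a"] },
  { message: "b", path: ["test", "b"] },
  { message: "c", path: ["test", "c"] },
];

// `data.test` is null, so look for errors at or below ["test"].
const candidates = errorsUnder(responseErrors, ["test"]);
console.log(candidates.map((e) => e.message)); // [ 'a', 'b', 'c' ]
```

All three errors qualify, and nothing in the response marks `b` as the one that triggered propagation.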

Solution: recommend that servers don't keep adding to errors after error propagation has occurred. This would mean:

  1. GraphQL.js won't keep adding to errors after the operation has "completed"
  2. Clients that throw on error can throw the last error received that relates to the associated field, and trust that, for a server following the recommendations, it's going to be the one either from the field itself or from the field that triggered error propagation to this level.
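A sketch of how a throw-on-error client might apply the second point, assuming "relates to the associated field" means "has a path at or below the field's path". `errorToThrow` and `ResponseError` are hypothetical names, not an existing library API.

```typescript
type PathSegment = string | number;

interface ResponseError {
  message: string;
  path?: ReadonlyArray<PathSegment>;
}

// True when `errorPath` is the field's own path or a descendant of it.
function isAtOrBelow(
  errorPath: ReadonlyArray<PathSegment>,
  fieldPath: ReadonlyArray<PathSegment>,
): boolean {
  return (
    fieldPath.length <= errorPath.length &&
    fieldPath.every((segment, i) => errorPath[i] === segment)
  );
}

// Hypothetical helper: the last error in list order at or below `fieldPath`.
function errorToThrow(
  errors: ReadonlyArray<ResponseError>,
  fieldPath: ReadonlyArray<PathSegment>,
): ResponseError | undefined {
  let last: ResponseError | undefined;
  for (const error of errors) {
    if (error.path && isAtOrBelow(error.path, fieldPath)) last = error;
  }
  return last;
}

// Errors as a server following the recommendation would return them.
const recommended: ResponseError[] = [
  { message: "a", path: ["test", "a"] },
  { message: "b", path: ["test", "b"] },
];

// Errors as in the three-error response above.
const withLateSibling: ResponseError[] = [
  ...recommended,
  { message: "c", path: ["test", "c"] },
];

console.log(errorToThrow(recommended, ["test"])?.message); // "b"
console.log(errorToThrow(withLateSibling, ["test"])?.message); // "c"
```

With the recommended behaviour the heuristic lands on `b`, the error that actually nulled `data.test`; with the late sibling error included it picks `c` instead.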

@yaacovCR
Contributor

Took a stab at the implementation in graphql-js within our 16.x.x line:

graphql/graphql-js#4458

That said, part of me feels that an implementer with deep knowledge of the expected relative ordering of their resolvers could, in theory, be confused by the missing errors, in which case this might belong in v17.

Thoughts?

Contributor

@martinbonnin martinbonnin left a comment

+1 to this 👍

That being said, I think this problem highlights that the current algorithms are not 100% clear on the semantics of raising errors and cancellation.

I get that resolvers might not be cancellable but it's surprising to me that graphql-js code is still executed after the execution result is received by the caller.

Ideally, there is a prompt cancellation guarantee that every callback checks for cancellation and stops processing if cancelled. Doing so in the language-neutral spec sounds like a terrible headache though 😄 Problem for another day!

Never mind, it's an issue even (and especially) in the absence of cancellation. Well, that sucks

@martinbonnin
Contributor

Putting down my thoughts from yesterday's wg before I forget everything about them.

This is a significant issue but the ultimate fix is onError: NULL IMO. I'm currently leaning towards declaring bankruptcy on this specific issue:

If users can update their servers, they should add proper support for onError: NULL. If they can't, they won't be able to fix this issue anyways 🤷

This means when an error bubbles, there is no way to know which error triggered the bubbling. It's the existing behaviour. It's unfortunate but it is what it is. If you want to do better, migrate to onError: NULL.

Note: I would still change the graphql-js behaviour to not have the response change after the promise is resolved, this feels very surprising to me.

@yaacovCR
Contributor

Relevant, in terms of cancellation, merged to v17:

@yaacovCR
Contributor

@martinbonnin could you elaborate a bit more on the scenarios in which graphql/graphql-js#4458 and relying on the final error on that nulled path does not work?

@martinbonnin
Contributor

> Relevant, in terms of cancellation, merged to v17

I can still reproduce @benjie's behaviour, where the result changes after the promise has been resolved, even using 17.0.0-alpha.9:

$ node test.mts 
Result:

{
  "errors": [
    { "message": "a", "path": ["test", "a"] },
    { "message": "b", "path": ["test", "b"] }
  ],
  "data": { "test": null }
}

Exact same object 100ms later:

{
  "errors": [
    { "message": "a", "path": ["test", "a"] },
    { "message": "b", "path": ["test", "b"] },
    { "message": "c", "path": ["test", "c"] }
  ],
  "data": { "test": null }
}

This seems surprising to me. Is that expected?

> could you elaborate a bit more on the scenarios in which graphql/graphql-js#4458 and relying on the final error on that nulled path does not work?

My understanding is that the problem we are trying to solve is allowing clients using graphql-toe to determine what error caused the null-bubbling without schema knowledge?

#4458 is indeed a solution to that problem.

My point is that it is an inferior solution to onError: null. It is potentially a breaking change (the same query now returns a different result) and also requires updating your server (same as onError: null) while not allowing fine-grained error-handling.

I'd rather focus our efforts and messaging on onError: null.

@yaacovCR
Contributor

> I can still reproduce @benjie's behaviour, where the result changes after the promise has been resolved, even using 17.0.0-alpha.9:
>
> This seems surprising to me. Is that expected?

Yes, it needs graphql/graphql-js#4458 to solve that issue. What has been merged is eventual cancellation of the resolver cascade (in addition to triggering of passed abort signal merged separately). Just adding that your (and my) prompt cancellation aspirations, while not fulfilled in v17, have been pushed forward a bit.

Creating many fine grained abort controllers to immediately cancel turned out to be too much of a performance hit… and we didn’t think enough to care about the issue of spooky additional errors after completion.

> My point is that it is an inferior solution to onError: null. It is potentially a breaking change (the same query now returns a different result) and also requires updating your server (same as onError: null) while not allowing fine-grained error-handling.

Got it.

> I'd rather focus our efforts and messaging on onError: null.

Shouldn’t we fix this behavior for all onError modes?

@martinbonnin
Contributor

> Yes, it needs graphql/graphql-js#4458 to solve that issue

Gotcha 👍. If I may nitpick the terminology a bit here, my point is that:

  • Sibling errors should not be added after **cancellation** => this is needed.
  • Sibling errors should not be added after **propagation** => this is probably not needed.
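The two rules check different things, which a toy error collector can show. This is a sketch under assumptions, not graphql-js's implementation: it reads "cancellation" as "the result has already been handed to the caller", `ErrorCollector` and its methods are made-up names, and the `other` and `late` fields are hypothetical additions to the example query.

```typescript
type Path = ReadonlyArray<string | number>;

interface CollectedError {
  message: string;
  path: Path;
}

// True when `path` is `ancestor` itself or lies below it.
function isUnder(path: Path, ancestor: Path): boolean {
  return (
    ancestor.length <= path.length &&
    ancestor.every((segment, i) => path[i] === segment)
  );
}

class ErrorCollector {
  readonly errors: CollectedError[] = [];
  private readonly nulledPaths: Path[] = [];
  private finished = false;

  // Propagation rule: remember that the position at `path` was nulled.
  markNulled(path: Path): void {
    this.nulledPaths.push(path);
  }

  // Cancellation rule: the result has been handed to the caller.
  finish(): void {
    this.finished = true;
  }

  // Returns true if the error was recorded, false if it was dropped.
  add(error: CollectedError): boolean {
    if (this.finished) return false;
    if (this.nulledPaths.some((n) => isUnder(error.path, n))) return false;
    this.errors.push(error);
    return true;
  }
}

const collector = new ErrorCollector();
collector.add({ message: "a", path: ["test", "a"] });
collector.add({ message: "b", path: ["test", "b"] });
collector.markNulled(["test"]); // `b` is non-null, so `test` becomes null

// Dropped by the propagation rule while the operation is still running.
const keptC = collector.add({ message: "c", path: ["test", "c"] });
// Recorded: a different part of the result that has not been nulled.
const keptOther = collector.add({ message: "other", path: ["other"] });

collector.finish();
// Dropped by the cancellation rule, whatever its path.
const keptLate = collector.add({ message: "late", path: ["late"] });

console.log(keptC, keptOther, keptLate); // false true false
console.log(collector.errors.map((e) => e.message)); // [ 'a', 'b', 'other' ]
```

With only the `finished` check, `c` would still be recorded whenever it arrives before the operation as a whole completes; the `nulledPaths` check is what the propagation rule adds on top.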

> Creating many fine grained abort controllers to immediately cancel turned out to be too much of a performance hit…

Apologies in advance for the naive question but since JS is ultimately single threaded, shouldn't checking for cancellation be reading a single per-field flag? Or is this what is actually slow?

> Shouldn’t we fix this behavior for all onError modes?

I'm fine and happy to let it go for the current onError: PROPAGATE mode.

Fixing it means that some current queries will see a different response (some errors will disappear). As with every change, it might break someone's workflow. It's more work for us, new entries to process in the graphql-js changelog for everyone, all of that for something that IMO should become the "legacy" error mode.

I say it's not worth the tradeoff.

Labels
💭 Strawman (RFC 0) RFC Stage 0 (See CONTRIBUTING.md)
3 participants