feat(cli): add crash reporting with trace ids#2880
feat(cli): add crash reporting with trace ids#2880matthewvolk wants to merge 2 commits intoalphafrom
Conversation
Centralized error handling via withErrorHandler HOF, singleton telemetry with UUID trace ids, X-Correlation-Id headers on all API calls, and global uncaught exception handlers. On any CLI error, a trace ID is displayed for support debugging.
|
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
chanceaclark
left a comment
There was a problem hiding this comment.
I feel like a lot of this PR can be simplified. Let's work together tomorrow to trim it down.
.changeset/brave-tigers-trace.md
Outdated
There was a problem hiding this comment.
You can omit the changeset here
| properties: { | ||
| ...payload, | ||
| sessionId: this.sessionId, | ||
| traceId: this.traceId(), |
There was a problem hiding this comment.
Trace id and session id above are the same thing. Can we just normalize this all to use session id internally but show it as trace id externally?
| @@ -46,5 +45,8 @@ export const telemetryPreHook = async (command: Command) => { | |||
| }; | |||
|
|
|||
| export const telemetryPostHook = async () => { | |||
There was a problem hiding this comment.
Are there arguments on this that we can determine if the command failed with an error? It would be cleaner than having to add a closure around each command and is more scalable.
There was a problem hiding this comment.
Ok I was actually just playing around with this in a sandbox and here's what I found:
const { Command } = require('commander');
const p = new Command();
p.command('test').action(async () => { throw new Error('boom'); });
p.hook('postAction', () => { console.log('POST_ACTION_FIRED'); });
p.parseAsync(['node', 'test', 'test']).catch(e => console.log('CAUGHT:', e.message));If you run that, you don't see POST_ACTION_FIRED. I guess hooks in Commander work like .then()?
TL;DR... to your original question, I don't think so... but open to your thoughts
| }); | ||
| } | ||
|
|
||
| async trackError(commandName: string, error: unknown) { |
There was a problem hiding this comment.
Do we need to add a whole new function if we are only calling it once? Can we just use the existing track function?
| } | ||
| } | ||
|
|
||
| let telemetryInstance: Telemetry | undefined; |
There was a problem hiding this comment.
Can you explain why we need a singleton here?
There was a problem hiding this comment.
Right now it's because the pre-hook, post-hook, error handler, and command action callback (for X-Correlation-Id) all call getTelemetry() independently — without the singleton they'd each get a fresh sessionId... though after I incorporate your feedback to leverage the postHook more, maybe this changes
There was a problem hiding this comment.
I think a fresh session per command call is expected? As in you call the CLI once, you get a new session, then call it again it's a separate session.
|
@chanceaclark just updated, let me know what you think 👍 |
eb7d627 to
d9d52a4
Compare
What/Why?
When the CLI crashes, customers send screenshots to support which aren't actionable. This PR instruments the CLI so that on any error, a trace ID (UUID) is displayed. The customer shares it with support, who can look up the session across telemetry systems.
This works because every outgoing BigCommerce API request now includes an
X-Correlation-Idheader set to the trace ID.Changes
Centralized error handling
withErrorHandler()HOF that wraps Commander.action()callbacksconsola.error+process.exit(1)from all 6 command actions (deploy, build, dev, start, project list/create/link)Telemetry singleton
getTelemetry()/resetTelemetry()singleton pattern replaces per-modulenew Telemetry()instancessessionIdchanged fromrandomBytes(32).toString('hex')torandomUUID()for a human-readable, support-friendly formattraceId(),durationMs(), andtrackError()methodstrack()calls enriched withtraceId,nodeVersion,platform,arch,cliVersionAPI correlation
X-Correlation-Idheader added to all 6 BigCommerce API fetch calls (4 in deploy.ts, 2 in project.ts)Global safety net
uncaughtExceptionandunhandledRejectionhandlers inindex.tsUser-facing output on crash:
Testing
error-handler.spec.tsandtelemetry.spec.tsdeploy.spec.tsandproject.spec.tsMigration
N/A — no breaking changes to public API or file structure.