Add libbacktrace fallback for stack traces on musl/Alpine #3581

Open
hanxizh9910 wants to merge 3 commits into valkey-io:unstable from hanxizh9910:libbacktrace-alpine-support

Conversation

@hanxizh9910
Contributor

@hanxizh9910 hanxizh9910 commented Apr 28, 2026

Problem

On Alpine Linux (musl-based), execinfo.h is not available, so Valkey produces no stack trace on crash. This makes debugging crashes on Alpine very difficult.

Solution

Add libbacktrace as a fallback for stack frame collection on systems without execinfo.h.

  • src/config.h: Split HAVE_BACKTRACE (generic capability) from HAVE_EXECINFO (execinfo-specific). When USE_LIBBACKTRACE is defined and HAVE_EXECINFO is not, HAVE_BACKTRACE is still set so the crash handler compiles in on Alpine.
  • src/debug.c:
    • Add valkey_backtrace() using backtrace_simple() as a drop-in replacement for backtrace() on musl. Macro-aliases backtrace() to valkey_backtrace() so no other code needs to change.
    • Fix fallback messages in symbolizeWithLibbacktrace to correctly distinguish between "execinfo fallback available" and "no fallback available".
  • src/server.c / src/server.h: Initialize backtrace_state once at startup via initLibbacktraceFrameState() to avoid calling malloc from the signal handler, which could deadlock if a crash occurs while the allocator lock is held.
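
The macro split and the backtrace() alias described above can be sketched roughly as follows. This is an illustrative reduction, not the PR's code: the platform test is a simplified stand-in for the real detection in src/config.h, and struct frame_buf, bt_simple_cb, and collect_frames are hypothetical names. The libbacktrace branch only compiles when USE_LIBBACKTRACE is defined on a system without execinfo.h and libbacktrace is installed.

```c
#include <stdint.h> /* uintptr_t; on glibc this also makes __GLIBC__ visible */

/* Assumption: simplified platform detection standing in for src/config.h. */
#if defined(__GLIBC__) || defined(__APPLE__)
#define HAVE_EXECINFO 1
#endif

/* Generic capability: either backend is enough to compile the crash handler. */
#if defined(HAVE_EXECINFO) || defined(USE_LIBBACKTRACE)
#define HAVE_BACKTRACE 1
#endif

#if defined(HAVE_EXECINFO)
#include <execinfo.h>
#elif defined(USE_LIBBACKTRACE)
#include <backtrace.h>

/* Hypothetical helper: output buffer filled by the frame callback. */
struct frame_buf { void **trace; int max; int count; };

static struct backtrace_state *bt_frame_state; /* created once at startup */

static void bt_simple_error_cb(void *data, const char *msg, int errnum) {
    (void)data; (void)msg; (void)errnum; /* errors are ignored in this sketch */
}

void initLibbacktraceFrameState(void) {
    bt_frame_state = backtrace_create_state(NULL, 0, bt_simple_error_cb, NULL);
}

static int bt_simple_cb(void *data, uintptr_t pc) {
    struct frame_buf *buf = data;
    if (buf->count >= buf->max) return 1; /* nonzero stops the walk */
    buf->trace[buf->count++] = (void *)pc;
    return 0;
}

static int valkey_backtrace(void **trace, int max_size) {
    struct frame_buf buf = {trace, max_size, 0};
    if (bt_frame_state == NULL) return 0; /* state was never initialized */
    backtrace_simple(bt_frame_state, 0, bt_simple_cb, bt_simple_error_cb, &buf);
    return buf.count;
}

/* Alias so existing backtrace() call sites stay unchanged. */
#define backtrace(trace, size) valkey_backtrace((trace), (size))
#endif

/* Hypothetical caller: identical source on both backends. */
int collect_frames(void **trace, int max_size) {
#ifdef HAVE_BACKTRACE
    return backtrace(trace, max_size);
#else
    (void)trace; (void)max_size;
    return 0; /* no backend available: no frames */
#endif
}
```

A caller would pass a fixed-size array, for example void *trace[100], and treat the return value as the number of frames written. In the libbacktrace branch, initLibbacktraceFrameState() has to run at startup first, otherwise the sketch returns 0 frames.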

Notes

Testing

Built on Alpine 3.23 with USE_LIBBACKTRACE=yes and triggered a crash via DEBUG SEGFAULT. Stack traces are produced correctly for all threads.

  • After the implementation: [Screenshot: part of the stack trace output on Alpine]

@hanxizh9910 hanxizh9910 marked this pull request as draft April 28, 2026 21:12
@codecov

codecov Bot commented Apr 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.64%. Comparing base (678a06d) to head (71ae67a).
⚠️ Report is 1 commit behind head on unstable.

Additional details and impacted files
@@            Coverage Diff            @@
##           unstable    #3581   +/-   ##
=========================================
  Coverage     76.63%   76.64%           
=========================================
  Files           160      160           
  Lines         80472    80472           
=========================================
+ Hits          61668    61674    +6     
+ Misses        18804    18798    -6     
Files with missing lines Coverage Δ
src/debug.c 54.83% <100.00%> (ø)
src/server.c 89.50% <ø> (-0.11%) ⬇️
src/server.h 100.00% <ø> (ø)

... and 21 files with indirect coverage changes

@hanxizh9910 hanxizh9910 force-pushed the libbacktrace-alpine-support branch from 8b39249 to 8bbe43f Compare April 29, 2026 22:58
Signed-off-by: Hanxi Zhang <hanxizh@amazon.com>
Signed-off-by: Hanxi Zhang <hanxizh@amazon.com>
Signed-off-by: Hanxi Zhang <hanxizh@amazon.com>
@hanxizh9910 hanxizh9910 force-pushed the libbacktrace-alpine-support branch from 836b39d to 71ae67a Compare April 30, 2026 00:36
@hanxizh9910 hanxizh9910 marked this pull request as ready for review April 30, 2026 01:35
@rainsupreme rainsupreme self-requested a review April 30, 2026 19:43
Contributor

@rainsupreme rainsupreme left a comment

This looks like a good change! Thanks for the fix 😊 I just had a couple ideas to simplify/document a little more.

I'm wondering why the stack trace in your testing was so short. Normally I'd expect it to be somewhat longer. Maybe if you built with CFLAGS="-g" there would be more info. I don't think this is too important though.

Comment thread src/debug.c
bt_frame_state = backtrace_create_state(NULL, 0, bt_simple_error_cb, NULL);
}

static int valkey_backtrace(void **trace, int max_size) {
Contributor

It might be worth adding a comment here explaining that we preallocate bt_frame_state because allocating memory is unsafe inside an async signal handler (such as the crash handler). This probably isn't immediately obvious, and a comment would help the next person who reads this code.
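
To illustrate the pattern behind this suggestion, here is a generic sketch of "allocate at startup, only read in the handler". It does not use libbacktrace, and every name in it (init_crash_state, install_handler, on_signal, crash_buf) is hypothetical. The handler restricts itself to write(2) and preallocated memory, both of which are async-signal-safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical preallocated state, analogous in role to bt_frame_state. */
static char *crash_buf;
static size_t crash_buf_len;
static volatile sig_atomic_t handler_ran;

/* Called once at startup, outside any signal context, so malloc is safe. */
int init_crash_state(const char *banner) {
    crash_buf_len = strlen(banner);
    crash_buf = malloc(crash_buf_len);
    if (crash_buf == NULL) return -1;
    memcpy(crash_buf, banner, crash_buf_len);
    return 0;
}

/* The handler never allocates: it only reads memory prepared at startup
 * and calls write(2), which is async-signal-safe. */
static void on_signal(int sig) {
    (void)sig;
    if (crash_buf != NULL) {
        ssize_t r = write(STDERR_FILENO, crash_buf, crash_buf_len);
        (void)r;
    }
    handler_ran = 1;
}

int install_handler(int signo) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

int handler_has_run(void) { return handler_ran; }
```

If the handler called malloc instead, a signal delivered while the interrupted thread held the allocator lock could deadlock, which is the failure mode the PR description mentions.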

Comment thread src/debug.c
#ifdef HAVE_BACKTRACE
#define BACKTRACE_MAX_SIZE 100

#if defined(USE_LIBBACKTRACE) && !defined(HAVE_EXECINFO)
Contributor

When USE_LIBBACKTRACE is enabled, we should use backtrace_simple() from libbacktrace for frame collection unconditionally, not just when execinfo.h is missing. Both call _Unwind_Backtrace() under the hood, so there's no benefit to keeping the glibc path as a special case. This simplifies the guard here to just #ifdef USE_LIBBACKTRACE.
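
For context on the shared mechanism, a minimal sketch of collecting frames directly with _Unwind_Backtrace() from <unwind.h> (shipped with GCC and Clang) might look like this. It is not how either library is actually written; struct unwind_buf, unwind_cb, and unwind_collect are hypothetical names.

```c
#include <stdint.h>
#include <unwind.h>

/* Hypothetical helper: output buffer filled by the unwinder callback. */
struct unwind_buf { void **trace; int max; int count; };

/* Invoked once per stack frame by the unwinder. */
static _Unwind_Reason_Code unwind_cb(struct _Unwind_Context *ctx, void *arg) {
    struct unwind_buf *buf = arg;
    if (buf->count >= buf->max) return _URC_END_OF_STACK; /* buffer full: stop */
    uintptr_t pc = (uintptr_t)_Unwind_GetIP(ctx);
    if (pc != 0) buf->trace[buf->count++] = (void *)pc;
    return _URC_NO_REASON; /* keep walking */
}

/* Same shape as backtrace(): fills trace and returns the frame count. */
int unwind_collect(void **trace, int max_size) {
    struct unwind_buf buf = {trace, max_size, 0};
    _Unwind_Backtrace(unwind_cb, &buf);
    return buf.count;
}
```

How many frames come back depends on whether the binary carries unwind tables, so the count should be treated as platform- and flag-dependent.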
