Changes from 2 commits
11 changes: 3 additions & 8 deletions api/include/opentelemetry/common/spin_lock_mutex.h
@@ -82,8 +82,7 @@ class SpinLockMutex
*/
bool try_lock() noexcept
{
return !flag_.load(std::memory_order_relaxed) &&
!flag_.exchange(true, std::memory_order_acquire);
return !flag_.exchange(true, std::memory_order_acquire);

How is this more efficient? The load(std::memory_order_relaxed) would avoid a memory fence and the wait for cache synchronization. Having said that, the whole idea of a spinlock in user mode seems dubious, especially in high-contention scenarios.

}

/**
Expand All @@ -95,13 +94,9 @@ class SpinLockMutex
*/
void lock() noexcept
{
for (;;)
// Try once
while (!try_lock())
Member:

This seems less efficient under contention. Calling try_lock() in every iteration causes repeated atomic exchanges and cache invalidations. The previous version avoided that by doing the initial exchange outside the spin loop.

{
// Try once
if (!flag_.exchange(true, std::memory_order_acquire))
{
return;
}
// Spin-Fast (goal ~10ns)
for (std::size_t i = 0; i < SPINLOCK_FAST_ITERATIONS; ++i)
{
Copilot AI, Oct 20, 2025, on lines +94 to 98:
The spin-fast loop at line 101 is now unreachable. The while (!try_lock()) loop continuously retries try_lock() without ever entering the body to execute the spin-fast logic. The original structure with explicit if (!flag_.exchange(...)) return; allowed falling through to the spin logic when the lock failed. This breaks the intended backoff behavior.

70 changes: 38 additions & 32 deletions api/test/common/spinlock_benchmark.cc
@@ -3,7 +3,7 @@

#include <benchmark/benchmark.h>
#include <atomic>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

@@ -27,8 +27,8 @@ constexpr int TightLoopLocks = 10000;
//
// lock: A lambda denoting how to lock. Accepts a reference to `SpinLockType`.
// unlock: A lambda denoting how to unlock. Accepts a reference to `SpinLockType`.
template <typename SpinLockType, typename LockF, typename UnlockF>
inline void SpinThrash(benchmark::State &s, SpinLockType &spinlock, LockF lock, UnlockF unlock)
template <typename LockF, typename UnlockF>
void SpinThrash(benchmark::State &s, LockF lock, UnlockF unlock)
{
auto num_threads = s.range(0);
// Value we will increment, fighting over a spinlock.
@@ -49,9 +49,9 @@ inline void SpinThrash(benchmark::State &s, SpinLockType &spinlock, LockF lock,
// to ensure maximum thread contention.
for (int i = 0; i < TightLoopLocks; i++)
{
lock(spinlock);
lock();
value++;
unlock(spinlock);
unlock();
}
});
}
@@ -63,35 +63,35 @@ inline void SpinThrash(benchmark::State &s, SpinLockType &spinlock, LockF lock,
}

// Benchmark of full spin-lock implementation.
static void BM_SpinLockThrashing(benchmark::State &s)
void BM_SpinLockThrashing(benchmark::State &s)
{
SpinLockMutex spinlock;
SpinThrash(s, spinlock, [](SpinLockMutex &m) { m.lock(); }, [](SpinLockMutex &m) { m.unlock(); });
SpinThrash(s, [&] { spinlock.lock(); }, [&] { spinlock.unlock(); });
}

// Naive `while(try_lock()) {}` implementation of lock.
static void BM_NaiveSpinLockThrashing(benchmark::State &s)
void BM_NaiveSpinLockThrashing(benchmark::State &s)
{
SpinLockMutex spinlock;
SpinThrash(
s, spinlock,
[](SpinLockMutex &m) {
while (!m.try_lock())
s,
[&] {
while (!spinlock.try_lock())
{
// Left this comment to keep the same format on old and new versions of clang-format
}
},
[](SpinLockMutex &m) { m.unlock(); });
[&] { spinlock.unlock(); });
}

// Simple `while(try_lock()) { yield-processor }`
static void BM_ProcYieldSpinLockThrashing(benchmark::State &s)
void BM_ProcYieldSpinLockThrashing(benchmark::State &s)
{
SpinLockMutex spinlock;
SpinThrash<SpinLockMutex>(
s, spinlock,
[](SpinLockMutex &m) {
while (!m.try_lock())
SpinThrash(
s,
[&] {
while (!spinlock.try_lock())
{
#if defined(_MSC_VER)
YieldProcessor();
Expand All @@ -108,33 +108,33 @@ static void BM_ProcYieldSpinLockThrashing(benchmark::State &s)
#endif
}
},
[](SpinLockMutex &m) { m.unlock(); });
[&] { spinlock.unlock(); });
}

// SpinLock thrashing with thread::yield().
static void BM_ThreadYieldSpinLockThrashing(benchmark::State &s)
void BM_ThreadYieldSpinLockThrashing(benchmark::State &s)
{
#if defined(__cpp_lib_atomic_value_initialization) && \
__cpp_lib_atomic_value_initialization >= 201911L
std::atomic_flag mutex{};
#else
std::atomic_flag mutex = ATOMIC_FLAG_INIT;
alignas(8) std::atomic_flag mutex = ATOMIC_FLAG_INIT;
Copilot AI, Oct 20, 2025:

[nitpick] The alignas(8) specifier appears to be added without explanation. If alignment is needed for performance, consider adding a comment explaining why 8-byte alignment is chosen, or remove it if it was added unintentionally.

Suggested change:
-    alignas(8) std::atomic_flag mutex = ATOMIC_FLAG_INIT;
+    std::atomic_flag mutex = ATOMIC_FLAG_INIT;
#endif
SpinThrash<std::atomic_flag>(
s, mutex,
[](std::atomic_flag &l) {
uint32_t try_count = 0;
while (l.test_and_set(std::memory_order_acq_rel))
SpinThrash(
s,
[&]() {
while (mutex.test_and_set(std::memory_order_acq_rel))
{
++try_count;
if (try_count % 32)
{
std::this_thread::yield();
}
std::this_thread::yield();
}
std::this_thread::yield();
},
[](std::atomic_flag &l) { l.clear(std::memory_order_release); });
[&] { mutex.clear(std::memory_order_release); });
}

void BM_StdMutexCheck(benchmark::State &s)
{
std::mutex mtx;
SpinThrash(s, [&] { mtx.lock(); }, [&] { mtx.unlock(); });
}

// Run the benchmarks at 2x thread/core and measure the amount of time to thrash around.
@@ -162,6 +162,12 @@ BENCHMARK(BM_ThreadYieldSpinLockThrashing)
->MeasureProcessCPUTime()
->UseRealTime()
->Unit(benchmark::kMillisecond);
BENCHMARK(BM_StdMutexCheck)
->RangeMultiplier(2)
->Range(1, std::thread::hardware_concurrency())
->MeasureProcessCPUTime()
->UseRealTime()
->Unit(benchmark::kMillisecond);

} // namespace
