ThreadId generation fallback path: avoid spurious yields #149481
Conversation
r? @ChrisDenton
rustbot has assigned @ChrisDenton. Use `r?` to explicitly pick a reviewer.
Looks fine, r=me with or without the nit.
library/std/src/thread/id.rs
Outdated
- while COUNTER_LOCKED.compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed).is_err() {
+ // Miri doesn't like it when we yield here as it interferes with deterministically
+ // scheduling threads, so use `compare_exchange` to avoid spurious yields.
+ while COUNTER_LOCKED.compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed).is_err() {
In this case, I'd use swap(true) here, as it can generate better assembly on platforms that don't have a CAS instruction, but do have swap (like RISC-V, even though it always has 64-bit atomics if I understand correctly).
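To illustrate the suggestion, here is a minimal standalone sketch contrasting the strong-CAS acquire from the diff with a swap-based acquire. `COUNTER_LOCKED` is the only name taken from the diff; the helper functions and the yield in the loop body are assumptions for illustration, not the actual library code:

use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
static COUNTER_LOCKED: AtomicBool = AtomicBool::new(false);
// Strong-CAS acquire, as in the diff above: it only fails (and yields) when the
// lock is actually held; `compare_exchange_weak` may also fail spuriously and
// so could reach the yield without any real contention.
fn lock_with_cas() {
    while COUNTER_LOCKED
        .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
        .is_err()
    {
        thread::yield_now();
    }
}
// Swap-based acquire: `swap(true)` returns the previous value, so seeing `false`
// means this thread took the lock. A plain atomic exchange can lower to better
// code on targets that have a swap instruction but no compare-and-swap.
fn lock_with_swap() {
    while COUNTER_LOCKED.swap(true, Ordering::Acquire) {
        thread::yield_now();
    }
}
fn unlock() {
    COUNTER_LOCKED.store(false, Ordering::Release);
}
fn main() {
    lock_with_cas();
    unlock();
    lock_with_swap();
    unlock();
}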
Force-pushed from bc59557 to 1fccfa6.
Like this?
Yes, exactly!
💔 Test failed - checks-actions
Force-pushed from 1fccfa6 to 14830bf.
I build-checked it this time. 🙈 @bors r=joboet
@joboet If there is interest, I also have this lock-free version:

#![feature(allocator_api)]
use std::alloc::System;
use std::ptr;
use std::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};
struct State {
base_value: u64,
increment: AtomicUsize,
prev: AtomicPtr<State>,
}
impl State {
fn new_ptr(base_value: u64, prev: *mut State) -> *mut State {
Box::into_raw_with_allocator(Box::new_in(
State {
base_value,
increment: AtomicUsize::new(0),
prev: AtomicPtr::new(prev),
},
System,
))
.0
}
unsafe fn drop_ptr(mut state: *mut State) {
unsafe {
while state != ptr::null_mut() {
let next = (*state).prev.load(Ordering::Acquire);
unsafe { drop(Box::from_raw_in(state, System)) };
state = next;
}
}
}
}
// A lock-free 64-bit counter using just AtomicPtr and AtomicUsize.
pub struct LockFreeU64Counter {
state: AtomicPtr<State>,
in_progress_increments: AtomicUsize,
}
impl LockFreeU64Counter {
pub fn new() -> Self {
LockFreeU64Counter {
state: AtomicPtr::new(State::new_ptr(0, ptr::null_mut())),
in_progress_increments: AtomicUsize::new(0),
}
}
/// Increments the counter and returns the old value.
///
/// Returns None if the counter has reached its maximum value.
pub fn increment(&self) -> Option<u64> {
self.in_progress_increments.fetch_add(1, Ordering::Relaxed);
loop {
let current_state_ptr = self.state.load(Ordering::Acquire);
let current_state = unsafe { &*current_state_ptr };
let base_value = current_state.base_value;
let increment = current_state.increment.load(Ordering::Relaxed);
let cur_value = base_value.checked_add(increment as u64)?;
if increment == usize::MAX {
let new_state_ptr = State::new_ptr(cur_value, current_state_ptr);
let cx = self.state.compare_exchange(
current_state_ptr,
new_state_ptr,
Ordering::AcqRel,
Ordering::Relaxed,
);
if cx.is_err() {
unsafe {
(*new_state_ptr).prev.store(ptr::null_mut(), Ordering::Relaxed);
State::drop_ptr(new_state_ptr)
};
}
continue;
}
let cx = current_state.increment.compare_exchange(
increment,
increment + 1,
Ordering::Relaxed,
Ordering::Relaxed,
);
if cx.is_err() {
continue;
}
if self.in_progress_increments.fetch_sub(1, Ordering::AcqRel) == 1 {
if current_state.prev.load(Ordering::Relaxed) != ptr::null_mut() {
let prev = current_state.prev.swap(ptr::null_mut(), Ordering::Acquire);
unsafe { State::drop_ptr(prev) };
}
}
return Some(cur_value);
}
}
}
impl Drop for LockFreeU64Counter {
fn drop(&mut self) {
unsafe { State::drop_ptr(self.state.load(Ordering::Relaxed)) };
}
}

EDIT: as Ralf points out this has a bug, working on a fix.
@orlp I think that has a potential UAF since you are using
@RalfJung I had already spotted another bug, but yes that is also an issue, let me fix that.
@RalfJung Please see the new implementation of the lock-free counter, which has a completely separate freelist:

#![feature(allocator_api)]
use std::alloc::System;
use std::cell::UnsafeCell;
use std::ptr;
use std::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};
struct State {
base_value: u64,
increment: AtomicUsize,
// UnsafeCell because the freelist accesses this field exclusively while
// the rest of the state may still be accessed concurrently.
next_free: UnsafeCell<*mut State>,
}
impl State {
fn new_ptr(base_value: u64) -> *mut State {
Box::into_raw_with_allocator(Box::new_in(
State {
base_value,
increment: AtomicUsize::new(0),
next_free: UnsafeCell::new(ptr::null_mut()),
},
System,
))
.0
}
unsafe fn drop_ptr(state: *mut State) {
unsafe { drop(Box::from_raw_in(state, System)); }
}
}
// A lock-free 64-bit counter using just AtomicPtr and AtomicUsize.
pub struct LockFreeU64Counter {
state: AtomicPtr<State>,
in_progress_increments: AtomicUsize,
free_head: AtomicPtr<State>,
}
impl LockFreeU64Counter {
pub fn new() -> Self {
LockFreeU64Counter {
state: AtomicPtr::new(State::new_ptr(0)),
in_progress_increments: AtomicUsize::new(0),
free_head: AtomicPtr::new(ptr::null_mut()),
}
}
/// Increments the counter and returns the old value.
///
/// Returns None if the counter has reached its maximum value.
pub fn increment(&self) -> Option<u64> {
// Acquire to ensure we don't see an old state that was reclaimed.
self.in_progress_increments.fetch_add(1, Ordering::Acquire);
loop {
let current_state_ptr = self.state.load(Ordering::Acquire);
let current_state = unsafe { &*current_state_ptr };
let base_value = current_state.base_value;
let increment = current_state.increment.load(Ordering::Acquire);
let cur_value = base_value.checked_add(increment as u64)?;
if increment == usize::MAX {
let new_state_ptr = State::new_ptr(cur_value);
let cx = self.state.compare_exchange(
current_state_ptr,
new_state_ptr,
Ordering::AcqRel,
Ordering::Relaxed,
);
match cx {
Ok(_) => unsafe { self.add_to_freelist(current_state_ptr) },
Err(_) => unsafe { State::drop_ptr(new_state_ptr) },
}
continue;
}
let cx = current_state.increment.compare_exchange(
increment,
increment + 1,
Ordering::Release,
Ordering::Relaxed,
);
if cx.is_err() {
continue;
}
if self.in_progress_increments.fetch_sub(1, Ordering::AcqRel) == 1 {
// Pass base_value as the limit to avoid reclaiming states that were added after
// we established in_progress_increments == 0, which can still be accessed.
unsafe { self.try_reclaim_freelist(base_value) };
}
return Some(cur_value);
}
}
unsafe fn try_reclaim_freelist(&self, base_value_limit: u64) {
if self.free_head.load(Ordering::Relaxed) == ptr::null_mut() {
return;
}
unsafe {
let mut head = self.free_head.swap(ptr::null_mut(), Ordering::Acquire);
let mut filtered_head = ptr::null_mut();
while head != ptr::null_mut() {
let next = *(*head).next_free.get_mut();
if (*head).base_value < base_value_limit {
State::drop_ptr(head);
head = next;
} else {
*(*head).next_free.get_mut() = filtered_head;
filtered_head = head;
head = next;
}
}
if filtered_head != ptr::null_mut() {
self.add_to_freelist(filtered_head);
}
}
}
unsafe fn add_to_freelist(&self, state: *mut State) {
unsafe {
let mut state_tail = state;
while *(*state_tail).next_free.get_mut() != ptr::null_mut() {
state_tail = *(*state_tail).next_free.get_mut();
}
let mut head = self.free_head.load(Ordering::Acquire);
loop {
*(*state_tail).next_free.get_mut() = head;
match self.free_head.compare_exchange(
head,
state,
Ordering::Release,
Ordering::Acquire,
) {
Ok(_) => break,
Err(new_head) => head = new_head,
}
}
}
}
}
impl Drop for LockFreeU64Counter {
fn drop(&mut self) {
unsafe {
let mut head = *self.free_head.get_mut();
while head != ptr::null_mut() {
let next = *(*head).next_free.get_mut();
State::drop_ptr(head);
head = next;
}
State::drop_ptr(*self.state.get_mut());
}
}
}
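For what it's worth, a quick sanity-check sketch of how this counter could be exercised, assuming the LockFreeU64Counter above is in scope (the thread and iteration counts are arbitrary):

use std::collections::HashSet;
use std::sync::Mutex;
use std::thread;
fn main() {
    let counter = LockFreeU64Counter::new();
    let seen = Mutex::new(HashSet::new());
    // Hammer the counter from several threads and check that every returned
    // value is unique, i.e. no increment is lost or handed out twice.
    thread::scope(|s| {
        for _ in 0..8 {
            s.spawn(|| {
                for _ in 0..10_000 {
                    let value = counter.increment().expect("counter overflowed");
                    assert!(seen.lock().unwrap().insert(value));
                }
            });
        }
    });
    assert_eq!(seen.lock().unwrap().len(), 8 * 10_000);
    // `increment` returns the old value, so the next call sees the total count.
    assert_eq!(counter.increment(), Some(8 * 10_000));
}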
💥 Test timed out
@bors retry
☀️ Test successful - checks-actions
What is this?
This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.
Comparing dfe1b8c (parent) -> 9b82a4f (this PR)

Test differences
2 doctest diffs were found. These are ignored, as they are noisy.

Test dashboard
Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 9b82a4fffe2b215f488a6dfdc0b508235f37c85c --output-dir test-dashboard

And then open

Job duration changes
Job durations can vary a lot, based on the actual runner instance
Finished benchmarking commit (9b82a4f): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

Max RSS (memory usage)
Results (primary 2.5%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Cycles
Results (secondary 2.1%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Binary size
Results (primary 0.0%, secondary 0.0%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Bootstrap: 472.475s -> 474.873s (0.51%)
Fixes rust-lang/miri#4737
Alternative to #149476
Cc @orlp @joboet