and asymmetric memory barriers of course.
https://threadnought.wordpress.com/2025/05/26/wait-free-hazard-pointers- using-std-atomics/
Looks like it runs faster without the conditional branch.
Not sure it's a new idea.ÿ The only wait-free version of hazard pointers
I've seen so far involves a storage to storage move instruction which
might not be available on all platforms.
On 5/26/2025 2:58 PM, jseigh wrote:
and asymmetric memory barriers of course.
https://threadnought.wordpress.com/2025/05/26/wait-free-hazard-
pointers- using-std-atomics/
Looks like it runs faster without the conditional branch.
Not sure it's a new idea.ÿ The only wait-free version of hazard pointers
I've seen so far involves a storage to storage move instruction which
might not be available on all platforms.
Interesting, thanks Joe! Fwiw, I think, something like FlushProcessWriteBuffers() should be in a new C++ std? Perhaps, allow a system to use DWCAS without resorting to not lock-free wrt the checks?
Damn it. ;^o Ouch.
https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/ nf-processthreadsapi-flushprocesswritebuffers
Sometimes I think that function was specifically designed for exotic
RCU, ect...
and asymmetric memory barriers of course.
https://threadnought.wordpress.com/2025/05/26/wait-free-hazard-pointers- using-std-atomics/
and asymmetric memory barriers of course.
https://threadnought.wordpress.com/2025/05/26/wait-free-hazard-pointers- using-std-atomics/
Looks like it runs faster without the conditional branch.
Not sure it's a new idea.ÿ The only wait-free version of hazard pointers
I've seen so far involves a storage to storage move instruction which
might not be available on all platforms.
Joe Seigh
On 5/26/2025 2:58 PM, jseigh wrote:
and asymmetric memory barriers of course.
https://threadnought.wordpress.com/2025/05/26/wait-free-hazard-
pointers- using-std-atomics/
Looks like it runs faster without the conditional branch.
Not sure it's a new idea.ÿ The only wait-free version of hazard pointers
I've seen so far involves a storage to storage move instruction which
might not be available on all platforms.
Joe Seigh
Actually wrt:
_______________
hz_ptr.store(INDETERMINATE, std::memory_order_relaxed);
T* local = ptr.load(std::memory_order_relaxed);
hz_ptr.store(local, std::memory_order_relaxed);
_______________
For some damn reason, it kind of reminds me of an older way to use only single word exchange to push into a lock-free stack.
Iirc, something like this pseudo code for push:
_____________
cur->next = WAIT_STATE;
node* prev = exchange(head, cur);
cur->next = prev;
_____________
There is a "window" in there where cur->next will be WAIT_STATE. So,
when a thread iterating the list, say after pop all, flush, it can do a couple of spins or do something else during iteration on a cur->next
being in WAIT_STATE. It's a way to get exchange on push and pop of a stack.
The window is very small. I am having trouble finding on this group
where I posted about it before in a working program...
On 5/28/25 18:34, Chris M. Thomasson wrote:
On 5/26/2025 2:58 PM, jseigh wrote:
and asymmetric memory barriers of course.
https://threadnought.wordpress.com/2025/05/26/wait-free-hazard-
pointers- using-std-atomics/
Looks like it runs faster without the conditional branch.
Not sure it's a new idea.ÿ The only wait-free version of hazard pointers >>> I've seen so far involves a storage to storage move instruction which
might not be available on all platforms.
Joe Seigh
Actually wrt:
_______________
hz_ptr.store(INDETERMINATE, std::memory_order_relaxed);
T* local = ptr.load(std::memory_order_relaxed);
hz_ptr.store(local, std::memory_order_relaxed);
_______________
For some damn reason, it kind of reminds me of an older way to use
only single word exchange to push into a lock-free stack.
Iirc, something like this pseudo code for push:
_____________
cur->next = WAIT_STATE;
node* prev = exchange(head, cur);
cur->next = prev;
_____________
There is a "window" in there where cur->next will be WAIT_STATE. So,
when a thread iterating the list, say after pop all, flush, it can do
a couple of spins or do something else during iteration on a cur->next
being in WAIT_STATE. It's a way to get exchange on push and pop of a
stack.
The window is very small. I am having trouble finding on this group
where I posted about it before in a working program...
Push may wait-free but pop isn't event lock-free.
Anyway, after some consideration, I'm off on this
wait-free hazard pointer.ÿ The lock-free version is
more than performant.
Joe Seigh
On 5/28/25 18:34, Chris M. Thomasson wrote:
On 5/26/2025 2:58 PM, jseigh wrote:
and asymmetric memory barriers of course.
https://threadnought.wordpress.com/2025/05/26/wait-free-hazard-
pointers- using-std-atomics/
Looks like it runs faster without the conditional branch.
Not sure it's a new idea.ÿ The only wait-free version of hazard pointers >>> I've seen so far involves a storage to storage move instruction which
might not be available on all platforms.
Joe Seigh
Actually wrt:
_______________
hz_ptr.store(INDETERMINATE, std::memory_order_relaxed);
T* local = ptr.load(std::memory_order_relaxed);
hz_ptr.store(local, std::memory_order_relaxed);
_______________
For some damn reason, it kind of reminds me of an older way to use
only single word exchange to push into a lock-free stack.
Iirc, something like this pseudo code for push:
_____________
cur->next = WAIT_STATE;
node* prev = exchange(head, cur);
cur->next = prev;
_____________
There is a "window" in there where cur->next will be WAIT_STATE. So,
when a thread iterating the list, say after pop all, flush, it can do
a couple of spins or do something else during iteration on a cur->next
being in WAIT_STATE. It's a way to get exchange on push and pop of a
stack.
The window is very small. I am having trouble finding on this group
where I posted about it before in a working program...
Push may wait-free but pop isn't event lock-free.
Anyway, after some consideration, I'm off on this
wait-free hazard pointer.ÿ The lock-free version is
more than performant.
Joe Seigh
On 5/29/2025 2:17 PM, jseigh wrote:>>>
Push may wait-free but pop isn't event lock-free.
Actually, I like my futex stack experiment as is. normal CAS for push,
SWAP for pop. It needs to have multiple stacks and distribute across
them because popping all from a single stack can mean a thread gets too
much work. Push lock-free, pop wait-free.
Anyway, after some consideration, I'm off on this
wait-free hazard pointer.ÿ The lock-free version is
more than performant.
On 6/1/25 16:31, Chris M. Thomasson wrote:
On 5/29/2025 2:17 PM, jseigh wrote:>>>
Push may wait-free but pop isn't event lock-free.
Well you actually have a lot of leeway in concurrent queue semantics.
Actually, I like my futex stack experiment as is. normal CAS for push,
SWAP for pop. It needs to have multiple stacks and distribute across
them because popping all from a single stack can mean a thread gets
too much work. Push lock-free, pop wait-free.
Drop the FIFO requirements since they're not really observable for
all practical purposes and just concentrate on throughput forward
progress.
And wait-free is highly overrated. It seems to be important only in
theses, technical papers and obscure blogs, not so much is real world applications which is one reason I not going to implement it.
Anyway, after some consideration, I'm off on this
wait-free hazard pointer.ÿ The lock-free version is
more than performant.
I did post on my blog how to improve reclaim forward progress for the wait-free hazard pointer so it's equivalent to a proxy collector.
On 6/2/2025 3:07 PM, jseigh wrote:
On 6/1/25 16:31, Chris M. Thomasson wrote:
On 5/29/2025 2:17 PM, jseigh wrote:>>>Well you actually have a lot of leeway in concurrent queue semantics.
Push may wait-free but pop isn't event lock-free.
Actually, I like my futex stack experiment as is. normal CAS for
push, SWAP for pop. It needs to have multiple stacks and distribute
across them because popping all from a single stack can mean a thread
gets too much work. Push lock-free, pop wait-free.
Drop the FIFO requirements since they're not really observable for
all practical purposes and just concentrate on throughput forward
progress.
And wait-free is highly overrated. It seems to be important only in
theses, technical papers and obscure blogs, not so much is real world
applications which is one reason I not going to implement it.
Perhaps. However, using a loop-free exchange is "faster" than a loop
based CAS? On Intel, XCHG vs a CMPXCHG loop?
Anyway, after some consideration, I'm off on this
wait-free hazard pointer.ÿ The lock-free version is
more than performant.
I did post on my blog how to improve reclaim forward progress for the
wait-free hazard pointer so it's equivalent to a proxy collector.
On 6/2/2025 3:07 PM, jseigh wrote:
On 6/1/25 16:31, Chris M. Thomasson wrote:
On 5/29/2025 2:17 PM, jseigh wrote:>>>Well you actually have a lot of leeway in concurrent queue semantics.
Push may wait-free but pop isn't event lock-free.
Actually, I like my futex stack experiment as is. normal CAS for
push, SWAP for pop. It needs to have multiple stacks and distribute
across them because popping all from a single stack can mean a thread
gets too much work. Push lock-free, pop wait-free.
Drop the FIFO requirements since they're not really observable for
all practical purposes and just concentrate on throughput forward
progress.
And wait-free is highly overrated. It seems to be important only in
theses, technical papers and obscure blogs, not so much is real world
applications which is one reason I not going to implement it.
Perhaps. However, using a loop-free exchange is "faster" than a loop
based CAS? On Intel, XCHG vs a CMPXCHG loop?
Anyway, after some consideration, I'm off on this
wait-free hazard pointer.ÿ The lock-free version is
more than performant.
I did post on my blog how to improve reclaim forward progress for the
wait-free hazard pointer so it's equivalent to a proxy collector.
On 6/3/25 15:09, Chris M. Thomasson wrote:
On 6/2/2025 3:07 PM, jseigh wrote:
On 6/1/25 16:31, Chris M. Thomasson wrote:
On 5/29/2025 2:17 PM, jseigh wrote:>>>Well you actually have a lot of leeway in concurrent queue semantics.
Push may wait-free but pop isn't event lock-free.
Actually, I like my futex stack experiment as is. normal CAS for
push, SWAP for pop. It needs to have multiple stacks and distribute
across them because popping all from a single stack can mean a
thread gets too much work. Push lock-free, pop wait-free.
Drop the FIFO requirements since they're not really observable for
all practical purposes and just concentrate on throughput forward
progress.
And wait-free is highly overrated. It seems to be important only in
theses, technical papers and obscure blogs, not so much is real world
applications which is one reason I not going to implement it.
Perhaps. However, using a loop-free exchange is "faster" than a loop
based CAS? On Intel, XCHG vs a CMPXCHG loop?
Anyway, after some consideration, I'm off on this
wait-free hazard pointer.ÿ The lock-free version is
more than performant.
I did post on my blog how to improve reclaim forward progress for the
wait-free hazard pointer so it's equivalent to a proxy collector.
I've been doing reader lock-free too long.ÿ I just realized how
trivially simple it is to do read write lock-free for almost any data structure.ÿ That's slightly embarrassing.
Sysop: | Tetrazocine |
---|---|
Location: | Melbourne, VIC, Australia |
Users: | 8 |
Nodes: | 8 (0 / 8) |
Uptime: | 115:23:39 |
Calls: | 161 |
Files: | 21,502 |
Messages: | 78,808 |