I haven't found a better place to post this, so if this isn't the right place, please point me where I should go...
I'm using Relacy 2.3 to test one of my algorithms, and I'm running into a problem with fair_full_search_scheduler_type on one of my tests; I don't have the same problem with the default scheduler. Inside rand_impl, the assertion RL_VERIFY(n.count_ == limit); fails: n.count_ is one higher than limit. Going up the stack, it looks like running_threads_count is one below the number of threads in the test: the test should run with 64 threads, but only 63 are running. (I get the same thing with 6 threads, btw.) I'm a bit confused as to how that can happen...
I've posted the stack trace below. As you can see, I'm in a WaitForSingleObject with an infinite time-out (some other thread will eventually wake me up). As far as I can tell, Relacy is trying to decide which thread (or fiber, rather) to run next and attempts to pick one randomly. It picks an entry from stree_, which at this time has size 17662 (stree_depth is 2). The tree is filled with values that look like: { 64, 0, sched_type_sched, 0 }.
The 63 comes from the running_threads_count variable, which is at 63 at the time rand_impl is called. That's because it was decremented in the block_thread function, called on behalf of the WaitForSingleObject. And that's what I don't understand: why is there an assertion that basically says "all possible threads must be running" when it is obviously possible that not all of them are? Should that assertion say RL_VERIFY(n.count_ >= limit) (i.e. "no more than all possible threads are running") and, if index_ > limit, just loop so it can try again?
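To make the question concrete, here is a minimal stand-alone sketch of the two checks. It is a toy model, not Relacy's code: toy_node, strict_check and relaxed_pick are hypothetical names, and only the field names count_ and index_ are taken from what the debugger shows. It also assumes that valid choices are zero-based, so it treats index_ >= limit as out of range.

```cpp
#include <cassert>

// Hypothetical stand-in for one recorded decision in the search tree.
// Not Relacy's actual type; only the field names follow the debugger view.
struct toy_node {
    unsigned count_;  // number of choices available when the node was recorded
    unsigned index_;  // choice recorded at this node
};

// Models the current assertion: the number of choices must match exactly.
inline bool strict_check(const toy_node& n, unsigned limit) {
    return n.count_ == limit;
}

// Models the proposed relaxation: accept a node recorded with at least
// `limit` choices. Returns true and writes the choice to `out` when the
// recorded index is usable; returns false when the caller would have to
// move on and try again.
inline bool relaxed_pick(const toy_node& n, unsigned limit, unsigned& out) {
    if (n.count_ < limit) {
        return false;  // more choices now than were recorded
    }
    if (n.index_ >= limit) {
        return false;  // recorded choice is out of range for the current limit (assumes zero-based choices)
    }
    out = n.index_;
    return true;
}
```

With a node recorded as { 64, 0 } and limit = 63, strict_check fails (this is the situation in my test), while relaxed_pick accepts index 0. A node recorded as { 64, 63 } would be rejected by relaxed_pick and would need the retry loop.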
If that looks like the right solution to you, I'll be happy to provide a patch (against the current SVN, if preferred) that does that.
Thanks,
rlc
Stack trace follows:
RelacyTests.exe!rl::tree_search_scheduler<rl::full_search_scheduler<64>,rl::tree_search_scheduler_thread_info<64>,64>::rand_impl(unsigned int limit=63, rl::sched_type t=sched_type_sched) Line 275 + 0xc bytes C++
RelacyTests.exe!rl::scheduler<rl::full_search_scheduler<64>,rl::tree_search_scheduler_thread_info<64>,64>::rand(unsigned int limit=63, rl::sched_type t=sched_type_sched) Line 134 C++
> RelacyTests.exe!rl::tree_search_scheduler<rl::full_search_scheduler<64>,rl::tree_search_scheduler_thread_info<64>,64>::schedule_impl(rl::unpark_reason & reason=unpark_reason_normal, unsigned int yield=1) Line 219 + 0x10 bytes C++
RelacyTests.exe!rl::scheduler<rl::full_search_scheduler<64>,rl::tree_search_scheduler_thread_info<64>,64>::schedule(rl::unpark_reason & reason=unpark_reason_normal, unsigned int yield=1) Line 121 + 0x17 bytes C++
RelacyTests.exe!rl::context_impl<ClockTest,rl::full_search_scheduler<64> >::schedule(unsigned int yield=1) Line 545 + 0x16 bytes C++
RelacyTests.exe!rl::context_impl<ClockTest,rl::full_search_scheduler<64> >::park_current_thread(bool is_timed=false, bool allow_spurious_wakeup=false, const rl::debug_info & info={...}) Line 370 C++
RelacyTests.exe!rl::waitset<64>::park_current(rl::context & c={...}, bool is_timed=false, bool allow_spurious_wakeup=false, const rl::debug_info & info={...}) Line 41 + 0x1d bytes C++
RelacyTests.exe!rl::event_data_impl<64>::wait(bool try_wait=false, bool is_timed=false, const rl::debug_info & info={...}) Line 240 + 0x1a bytes C++
RelacyTests.exe!rl::generic_event::wait(bool try_wait=false, bool is_timed=false, const rl::debug_info & info={...}) Line 349 + 0x25 bytes C++
RelacyTests.exe!rl::rl_WaitForSingleObjectEx(rl::win_object * obj=0x00b1d868, unsigned long timeout=4294967295, int alertable=0, const rl::debug_info & info={...}) Line 56 + 0x1d bytes C++
RelacyTests.exe!rl::rl_WaitForSingleObject(rl::win_object * obj=0x00b1d868, unsigned long timeout=4294967295, const rl::debug_info & info={...}) Line 69 + 0x13 bytes C++