Hello,
I realize this is a long mail, I hope you will not get bored.
under the following constraints
- 1 writer which updates a struct from time to time (several msec) and is allowed to be blocked by readers
- several readers that must be able to read the struct as quickly as possible, concurrently, without being blocked by the writer
which kind of concurrent data structure would you recommend ?
Because of the constraints that the readers cannot be blocked, I imagined a kind of versioned data structure :
It goes like this
- there is an array of struct. Each struct has an associated ref count of active readers
- there is a pointer to the last updated version (actually I use an index in the array), let's call it the 'public version'
- when a reader wants to read, it accesses the 'public version', increments the ref count associated with this struct, reads the data, then decrement it when it is done
- when the writer wants to publish a new version, it gets the next struct in the array (using the next index after the public index plus some modulo operation), verify there is no active readers, update the struct, then make it the public version
(it sounds to me a little bit like RCU, but I'm not familiar with RCU)
It works well if the writer infrequently updates the struct (because the readers never lag on a previously updated struct).
If I choose the number of versions to be greater than the number of readers, then the writer is guaranteed to always find an empty slot, and never block : it is also wait-free.
Now I'm afraid that if I 'release' this data structure in my team, even with all warnings on, somebody is going to reuse it with a writer in a spin loop, and that will corrupt the data. (there is a race when checking the refcount of the readers)
So I imagined this little variation, where readers use a kind of mix between a TicketLock and a SeqLock to allow the writer to detect when all readers are quiescent.
It goes like this (pseudo-code)
struct QuiescentMonitor
{
atomic<int> seq1_;
atomic<int> seq2_;
void read_section_enter()
{
seq1_++;
}
void read_section_leave()
{
seq2_++;
}
int write_section_try_enter()
{
int seq1;
for (;;)
{
seq1 = seq1_;
int seq2 = seq2_;
if (seq1 == seq2) break;
}
return seq1;
}
bool write_section_try_leave(int oldseq)
{
int seq1 = seq1_;
return (seq1 == oldseq);
}
};
The versioned data struct is now like (pseudo code)
template <typename T, int N>
struct VersionedData
{
struct DataWrapper
{
T data;
std::atomic<int> activeReadersRefCount;
};
DataWrapper data[N];
std::atomic<int> publicVersion { 0 };
QuiescentMonitor readerMonitor;
};
The readers now go like this:
readerMonitor.read_section_enter();
int version = publicVersion;
DataWrapper& publicData = data[version];
publicData.activeReadersRefCount++;
readerMonitor.read_section_leave();
[ read the data ]
publicData.activeReadersRefCount--;
And the writer like this :
int nextVersion = publicVersion + 1; // plus modulo operation
for (;;) // version loop
{
for (;;) // retry loop for quiescent detection
{
int seq = readerMonitor.write_section_try_enter();
DataWrapper& privateData = data[nextVersion];
if (privateData.activeReadersRefCount != 0)
{
// readers still active : try next version
break;
}
if (readerMonitor.write_section_try_leave(seq))
{
// we were able to read the ref count of this struct while all the readers were quiescent, and the refCount is 0.
// this means that no readers is lagging on this version, nor concurrently trying to read it
// we're also guaranteed readers will access publicVersion next time they issue a read, and definitively not this particular private version
[mutate data]
// make this version public
publicVersion = nextVersion;
return;
}
// else readers tried to access something, maybe they tried to access this particular privateVersion, maybe the publicVersion, so my read of 0 refCount might be invalid
// let's retry
}
// try next version
nextVersion += 1; // plus modulo operation, plus check that nextVersion is not pointing at publicVersion again
}
Although it looks like it works, I now have one fear : that the writer live locks if a reader keeps on reading in a tight loop.
So I'm trying to find a way around this. Any suggestion ? (or comments on the correctness so far)
With regards
David