
Synchronising background streaming threads


JoshG

Feb 27, 2009, 2:01:52 PM
Hey guys,

I have what I think is a common problem, but I'm not sure how to solve
it (it's been a while since I had to work with threads).

The scenario is slightly complex...

I have 2 threads. They operate in a "main thread" and a "background
thread" kind of fashion. The background thread is used for streaming
to/from the disk.

As such, I have a certain point in the code, where the background
thread has to wait for the main thread to be at a certain point in the
code. This has been done by a dual semaphore handshake (background
thread sets a flag, waits on a semaphore, update thread sees the flag,
signals that semaphore and waits on a different one, then once the
background thread is done it signals that semaphore). Simple right?
(if that has flaws in it please let me know!)
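Roughly, the handshake looks something like this (a simplified sketch with
invented names, not the real code; I've modelled "seeing the flag" as a
semaphore wait for brevity):

```cpp
#include <pthread.h>
#include <semaphore.h>

// Sketch of the dual-semaphore handshake (all names invented).
static sem_t g_ready;   // background -> main: "I'm at my sync point"
static sem_t g_resume;  // main -> background: "go ahead"
static int g_shared = 0;

extern "C" void* background_thread(void*) {
    // ... streaming work ...
    sem_post(&g_ready);   // set the flag / signal the first semaphore
    sem_wait(&g_resume);  // wait for the main thread to reach its sync point
    g_shared = 42;        // safe: main is blocked in sem_wait(&g_ready) below
    sem_post(&g_ready);   // done; release the main thread
    return NULL;
}

int run_handshake() {
    sem_init(&g_ready, 0, 0);
    sem_init(&g_resume, 0, 0);
    pthread_t tid;
    pthread_create(&tid, NULL, background_thread, NULL);
    sem_wait(&g_ready);   // main: background has reached its sync point
    sem_post(&g_resume);  // let it do its work while we block here...
    sem_wait(&g_ready);   // ...until it says it is done
    pthread_join(tid, NULL);
    return g_shared;
}
```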

But now I have a situation, where a cache that is used by the
background thread needs to be updated in the main thread. I was going
to do this by putting a lock around the different uses of
the cache. But before I release the cache in the background thread, I
need to synchronize with the main thread (which may either be blocking
on using the cache, or just running as normal). Now you might see that
I will have a deadlock here if the update thread is already blocking
on using the cache (because it will never get the chance to signal
semaphores and indicate it is ready to make changes).

So, I thought what if I had some sort of status flag, that would tell
me before attempting to lock the cache that the thread will have to
block for some time (in this case, it will know it needs to signal the
semaphores before obtaining the locks).

So I came up with the following code:

BeginUpdateOfCache()
{
    if (CacheLocked)
    {
        semaphore.signal()
        resume.wait()
    }
    cache.lock()
}

BeginReadOfCache()
{
    CacheLocked = true
    cache.lock()
}

But that will still deadlock. What if CacheLocked is false, the test
falls through, then the context switches to the other thread, which
sets CacheLocked to true and locks the lock; then another context
switch, and the first thread tries to lock - but can't...
Now the semaphores will never be signaled and I have deadlock
again...

So I thought of doing this:

BeginUpdateOfCache()
{
    CacheStatus.lock()
    if (CacheLocked)
    {
        semaphore.signal()
        resume.wait()
    }
    CacheStatus.unlock()
    cache.lock()
}

BeginReadOfCache()
{
    CacheStatus.lock()
    CacheLocked = true
    cache.lock()
    CacheStatus.unlock()
}

But I always thought that locking/unlocking resources out of order was
bad practice!

Does someone understand what I'm trying to do here? Can you offer any
suggestions as to what I should/shouldn't be doing?

Thanks all!

Josh

David Schwartz

Feb 27, 2009, 2:50:36 PM
On Feb 27, 11:01 am, JoshG <Inbi...@gmail.com> wrote:

> But now I have a situation, where a cache that is used by the
> background thread needs to be updated in the main thread. I was going
> to do this by applying the use of a lock around the different uses of
> the cache. But before I release the cache in the background thread, I
> need to synchronize with the main thread (which may either be blocking
> on using the cache, or just running as normal). Now you might see that
> I will have a deadlock here if the update thread is already blocking
> on using the cache (because it will never get the chance to signal
> semaphores and indicate it is ready to make changes).

Synchronize after you release the cache. You should only hold a lock
on the cache while you are accessing or modifying it. You should
release it as soon as you are doing *anything* else.
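For example, something like this (illustrative only, invented names): take
the lock, copy out what you need, drop the lock, then synchronize and work
on the copy.

```cpp
#include <pthread.h>
#include <vector>

// Illustrative only: hold the lock just long enough to copy the data.
static pthread_mutex_t g_cache_lock = PTHREAD_MUTEX_INITIALIZER;
static std::vector<int> g_cache;

std::vector<int> snapshot_cache() {
    pthread_mutex_lock(&g_cache_lock);
    std::vector<int> copy = g_cache;   // touch the shared cache only here
    pthread_mutex_unlock(&g_cache_lock);
    return copy;                       // synchronize/work on the copy after
}
```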

DS

JoshG

Feb 27, 2009, 3:37:42 PM

> Synchronize after you release the cache. You should only hold a lock
> on the cache while you are accessing or modifying it. You should
> release it as soon as you are doing *anything* else.
>
> DS

Thanks for your response,

The trouble with this is essentially that the Background thread is
updating a data structure that the main thread uses. The background
thread locks the cache while it uses that cache to bring all the newly
created objects in sync with the current data in the application.
Whilst this happens, it is important that the application data doesn't
change (Hence locking the cache). The current synchronization point is
where the background thread releases these new objects to the main
thread for standard use. If I were to release the cache before this
synchronization, then the main thread would update all its objects
with new data, and the new objects would never get this new data...

Does that make sense?
Thanks for the suggestion, perhaps the entire situation I have could
be redesigned, and if you guys have any suggestions on how I should do
that, I'm all ears!
Thanks again!

Josh

Chris M. Thomasson

Feb 27, 2009, 4:04:17 PM

"JoshG" <Inb...@gmail.com> wrote in message
news:2d752d0c-9ad6-42bb...@j35g2000yqh.googlegroups.com...

> Hey guys,
>
> I have what I think is a common problem, but I'm not sure how to solve
> it. (its been a while since I had to work with threads).
>
> The scenario is slightly complex...
>
> I have 2 threads. They operate in a "main thread" and a "background
> thread" kind of fashion. The background thread is used for streaming
> to/from the disk.
>
> As such, I have a certain point in the code, where the background
> thread has to wait for the main thread to be at a certain point in the
> code. This has been done by a dual semaphore handshake (background
> thread sets a flag, waits on a semaphore, update thread sees the flag,
> signals that semaphore and waits on a different one, then once the
> background thread is done it signals that semaphore). Simple right?
> (if that has flaws in it please let me know!)
>
> But now I have a situation, where a cache that is used by the
> background thread needs to be updated in the main thread. I was going
> to do this by applying the use of a lock around the different uses of
> the cache. But before I release the cache in the background thread, I
> need to synchronize with the main thread (which may either be blocking
> on using the cache, or just running as normal). Now you might see that
> I will have a deadlock here if the update thread is already blocking
> on using the cache (because it will never get the chance to signal
> semaphores and indicate it is ready to make changes).
> [...]


Well, since you have only two threads, you really don't "need" a lock on the
cache itself. What about something like <pseudo-code>:
________________________________________________________________
static semaphore ready;    // initial value of 0
static semaphore finished; // initial value of 1
static cache the_cache;


void background_thread() {
    for (;;) {
        ready.wait();
        the_cache.do_background_work();
        finished.signal();
    }
}


void main_thread() {
    for (;;) {
        finished.wait();
        the_cache.do_main_thread_work();
        ready.signal();
    }
}
________________________________________________________________

This ensures that only one thread is ever working with `the_cache' at
any one time.

Chris M. Thomasson

Feb 27, 2009, 4:23:18 PM
simple example program:
______________________________________________________________________
#include <pthread.h>
#include <semaphore.h>
#include <cerrno>
#include <cstdio>
#include <cassert>


class semaphore {
    sem_t m_handle;

public:
    semaphore(unsigned value = 0) {
        sem_init(&m_handle, 0, value);
    }

    ~semaphore() {
        sem_destroy(&m_handle);
    }

    void signal() {
        sem_post(&m_handle);
    }

    void wait() {
        // sem_wait() returns -1 and sets errno on failure; retry on EINTR.
        while (sem_wait(&m_handle) == -1 && errno == EINTR);
    }
};


#define ITERATIONS 99999U


static semaphore g_ready, g_finished(1);
static unsigned g_cache = 0;


void* background_thread(void* state) {
    unsigned cache = 0;
    do {
        g_ready.wait();
        cache = g_cache++;
        std::printf("background_thread observed %u - updated %u\n",
            cache, g_cache);
        g_finished.signal();
    } while (cache < ITERATIONS);
    return NULL;
}


void* main_thread(void* state) {
    unsigned cache = (unsigned)-1;
    do {
        g_finished.wait();
        assert(g_cache == cache + 1);
        cache = g_cache++;
        std::printf("main_thread observed %u - updated %u\n",
            cache, g_cache);
        ++cache;
        g_ready.signal();
    } while (cache < ITERATIONS);
    return NULL;
}


int main() {
    pthread_t tid[2];

    pthread_create(&tid[0], NULL, background_thread, NULL);
    pthread_create(&tid[1], NULL, main_thread, NULL);

    pthread_join(tid[0], NULL);
    pthread_join(tid[1], NULL);


    //--------------------------------------------------------------
    std::puts("\n\n\n____________________________________________"
        "____\npress <ENTER> to exit...");
    std::fflush(stdout);
    std::getchar();
    return 0;
}
______________________________________________________________________


No mutex lock is needed on the global variable `g_cache'. The call to
`assert()' in the `main_thread' checks for coherency.


Any thoughts?

David Schwartz

Feb 27, 2009, 5:03:56 PM
On Feb 27, 12:37 pm, JoshG <Inbi...@gmail.com> wrote:

> The trouble with this is essentially that the Background thread is
> updating a data structure that the main thread uses.

That's why you need to lock the structure while you're reading or
modifying it.

> The background
> thread locks the cache while it uses that cache to bring all the newly
> created objects in sync with the current data in the application.
> Whilst this happens, it is important that the application data doesn't
> change (Hence locking the cache). The current synchronization point is
> where the background thread releases these new objects to the main
> thread for standard use. If I were to release the cache before this
> synchronization, then the main thread would update all its objects
> with new data, and the new objects would never get this new data...

Why would it do that? I thought you had synchronized the threads. It
sounds like you are trying to use locks for synchronization.

The update thread shouldn't write over the data the other thread
needs. This is not because of any lock; it's because that would be
broken application logic.

> Does that make sense?

No. It makes no sense. Why would one thread, even if it could get a
lock, overwrite data that it knows the application still needs? That
would be wholly broken code, lock or no lock.

The lock prevents two threads from racing on the same data. But a
thread still has to know what to do once it gets the lock.

It sounds like your threads are at war rather than cooperating.

> Thanks for the suggestion, perhaps the entire situation I have could
> be redesigned, and if you guys have any suggestions on how I should do
> that, I'm all ears!

A thread should acquire a lock to do something it knows needs to be
done or to check whether something needs to be done. It should expect
to acquire the lock regardless of the application state.

DS

JoshG

Feb 27, 2009, 9:52:37 PM
Sorry David,

I've obviously failed to explain myself well enough.
I'll try again if you'll oblige me. Perhaps even explaining things
properly will be enough to get me by!
I'll explain the problem I need to solve, without explaining what I've
attempted (it's explained above anyway).
And perhaps you guys can point me to a solution that I've missed.

1) I have a list of objects.
2) I have 2 threads. Main thread, and Background thread.
3) Periodically these objects are queried (By the main thread)
4) Sporadically, these objects are updated with data from the
application (By the main thread)
5) Sporadically, new objects are added to the list (By the background
thread).

Query results from point 3 involve returning objects from the list.
These objects must remain valid until the beginning of the next
query.

When new objects are initialised in point 5, they must be initialised
with the latest application data (from point 4). This is referred to
as the "Cache".
Objects are added to the list, then initialised. This is because the
list's data structure involves part of the initialisation (I cannot
change this). The Main thread is prevented from accessing these 'half
initialised' objects by a barrier value that specifies the last
'completely initialised' object.

When initialisation of new objects is complete, the background thread
waits until the beginning of the next query before adjusting the value
of the "last completely initialised object". This prevents the query
results from having inconsistent data.
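To illustrate the barrier idea (a simplified, single-threaded model with
invented names; the real code of course wraps this in the synchronization
described above):

```cpp
#include <vector>
#include <cstddef>

// Simplified model of the object list with a "last completely
// initialised" barrier (invented names; illustration only).
struct ObjectList {
    std::vector<int> objects;
    std::size_t last_initialised;  // main thread may read [0, last_initialised)

    ObjectList() : last_initialised(0) {}

    // Background thread: append past the barrier; main cannot see it yet.
    void add_uninitialised(int v) { objects.push_back(v); }

    // Background thread, at the start of the next query: publish everything.
    void publish_all() { last_initialised = objects.size(); }

    // Main thread: queries only ever see fully initialised objects.
    std::size_t visible_count() const { return last_initialised; }
};
```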

Restating the problem:
So, I have a dual semaphore handshake to detect the beginning of a
query (where I update the valid length of the list).
And I have a lock around the application data cache to prevent the
application from changing the current data whilst I am initialising
the new objects (this way the objects will be initialised with soon to
become obsolete (but current) data, then the main thread will come
through and give everything fresh data).

But I need to set the "last completely initialised object" value
before unlocking the cache. Otherwise the Main thread will be
released, and will update the cache and loaded objects, but will still
be unable to update these new objects with the new data (because the
last completely initialised object value hasn't been updated yet). But
I can't change the last completely initialised object inside the lock,
because that would wait for the beginning of a query, and the main
thread is currently blocked waiting for the cache to unlock. DEADLOCK.

Was that clearer this time round?

Josh

Chris M. Thomasson

Feb 27, 2009, 9:54:47 PM
"Chris M. Thomasson" <n...@spam.invalid> wrote in message
news:WPYpl.47686$cI2....@newsfe09.iad...
> simple example program:
> ______________________________________________________________________


Of course!

> [...]
> void* background_thread(void* state) {


extern "C" void* background_thread(void* state) {

> [...]
> }
>
>
extern "C" void* main_thread(void* state) {
> [...]
> }
> [...]

Yikes! What a bone-headed mistake. You're calling functions from
`pthread_create()', so they MUST have C calling convention.

Chris M. Thomasson

Feb 27, 2009, 10:13:05 PM

"JoshG" <Inb...@gmail.com> wrote in message
news:320f1464-f38b-4a0c...@x13g2000yqf.googlegroups.com...

> Sorry David,
>
> I've obviously failed to explain myself well enough.
> [...]

> Was that clearer this time round?


I think so. Well, I need to think on this before giving any advice. Sorry.
Did you recently take control of this mess or what? Can you use condition
variables?

JoshG

Feb 28, 2009, 7:08:02 AM

> I think so. Well, I need to think on this before giving any advice. Sorry.
> Did you recently take control of this mess or what? Can you use condition
> variables?

Thanks for looking into this Chris,
I haven't just taken control of this, though the code is a library
that is used by another library, hence some bits I can change and
others I can't. I can change the way objects are stored in the list,
and the list structure itself, and what the threads do, but I can't
change the objects that are returned in the queries. Unfortunately the
requirement for this library to be threaded in this way has been
imposed recently (got to love changing specifications!).

As for condition variables, I do not have direct access to these
primitives, I have direct access to CritSections, and Semaphores. I
could probably implement my own condition variables in terms of those
primitives though.
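For the single-waiter case (which is all I have here), I'm imagining
something roughly along these lines (an untested sketch with invented
names; a pthread mutex stands in for the CritSection, and this is NOT a
general condition variable - it assumes exactly one waiter):

```cpp
#include <pthread.h>
#include <semaphore.h>
#include <cerrno>

// Single-waiter "condition variable" built from a mutex (CritSection
// stand-in) and a semaphore. Assumes at most one waiting thread.
struct single_waiter_cond {
    sem_t sem;
    bool waiting;

    single_waiter_cond() : waiting(false) { sem_init(&sem, 0, 0); }

    // Caller holds `m`. Record the wait, release the mutex, block,
    // then reacquire. The semaphore remembers a signal posted in the
    // window between unlock and sem_wait, so no wakeup is lost.
    void wait(pthread_mutex_t* m) {
        waiting = true;
        pthread_mutex_unlock(m);
        while (sem_wait(&sem) == -1 && errno == EINTR) {}
        pthread_mutex_lock(m);
    }

    // Caller holds the same mutex, so `waiting` is read safely.
    void signal() {
        if (waiting) { waiting = false; sem_post(&sem); }
    }
};

// Tiny two-thread demo: a setter thread sets g_done and signals,
// while the caller waits in the classic while-loop style.
static pthread_mutex_t g_m = PTHREAD_MUTEX_INITIALIZER;
static single_waiter_cond g_cv;
static bool g_done = false;

extern "C" void* setter(void*) {
    pthread_mutex_lock(&g_m);
    g_done = true;
    g_cv.signal();
    pthread_mutex_unlock(&g_m);
    return NULL;
}

bool demo() {
    pthread_t t;
    pthread_mutex_lock(&g_m);
    pthread_create(&t, NULL, setter, NULL);
    while (!g_done) g_cv.wait(&g_m);
    pthread_mutex_unlock(&g_m);
    pthread_join(t, NULL);
    return g_done;
}
```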

If you have any suggestions at all for how it might be set up better,
I'd be willing to hear them. I'm not ashamed to say I've not touched
threading very much for about 3 years now, so the best practices in
these situations are still beyond me at this stage.

Josh

