I am designing a Thread Manager that runs n Worker threads in batch
mode.
The Manager runs continuously and starts the next n Worker threads
once all threads in the previous batch have returned.
I am using a condition variable and signals to synchronize the whole
project.
My design:-
Manager:-
int worker_count= 0
pthread_mutex_lock(worker_mutex)
pthread_create -> all_workers in batch(worker_count updated to n)
worker_stopped = 0
for(;;)
{
while(worker_stopped == 0)
pthread_cond_wait(worker_stop_cond, worker_mutex)
pthread_join(worker_id, arg)
worker_count--
if(worker_count == 0)
pthread_create -> all_workers in next batch(worker_count
updated to n)
worker_stopped=0
}
pthread_mutex_unlock(worker_mutex)
Worker "i" :-
Do some work
pthread_mutex_lock(worker_mutex)
worker_stopped = pthread_self()
pthread_cond_signal(worker_stop_cond)
pthread_mutex_unlock(worker_mutex)
pthread_exit(arg)
The problem is that I am losing some signals, which leads to a
deadlock (the Manager just hangs in the wait even after all workers
have returned).
I debugged the program and it looks like the problem occurs if, say,
worker "B" locks "worker_mutex" after worker "A" has updated
"worker_stopped" and unlocked "worker_mutex" but before the Manager
could lock "worker_mutex". Worker "B" then overwrites
"worker_stopped" before the Manager has read it.
It looks like a race between the Manager and the other stopped
workers to grab the mutex after one worker has released it.
One solution I could think of is to make a stopped worker signal
only when "worker_stopped" is 0, but is there a better solution to
the problem?
I am not sure if my analysis is correct and wanted the opinion of
the community.
I also wanted to know whether it is OK to lock the mutex before
launching the threads. I am doing it so that no thread exits and
signals before the Manager calls pthread_cond_wait. Is there a better
way to implement this sync?
Looking forward to your suggestions.
Thanks
> I am working on a designing a Thread Manager to run n number of
> Worker threads in batch mode.
> (...)
> One solution I could think of it to make the stopped worker signal
> only when "worker_stopped" is 0
Yes, that's right. A worker deposits a value there, and then _some_
thread gets the mutex and either reads the value (if it is the main
thread) or overwrites it (if it is another worker).
So the worker should pthread_cond_wait() for the main thread to
pthread_cond_signal() that it has set worker_stopped = 0.
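A minimal sketch of that handshake, assuming one shared slot guarded
by a single mutex and two condition variables. The names `slot_lock`,
`slot_full`, `slot_free`, `stopped_index` and `run_handshake_batch`
are invented for this illustration, and a plain array index stands in
for the thread ID so that `worker_stopped` can stay a simple flag.

```c
#include <assert.h>
#include <pthread.h>

#define NWORKERS 8

static pthread_mutex_t slot_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t slot_full = PTHREAD_COND_INITIALIZER; /* manager waits */
static pthread_cond_t slot_free = PTHREAD_COND_INITIALIZER; /* workers wait */
static int worker_stopped = 0;  /* 0 = slot empty, 1 = slot holds a notice */
static int stopped_index;       /* which worker left the notice */
static pthread_t tids[NWORKERS];
static int ids[NWORKERS];

static void *worker(void *arg)
{
    int index = *(int *)arg;

    /* ... do some work ... */
    pthread_mutex_lock(&slot_lock);
    while (worker_stopped != 0)          /* never overwrite an unread notice */
        pthread_cond_wait(&slot_free, &slot_lock);
    stopped_index = index;
    worker_stopped = 1;
    pthread_cond_signal(&slot_full);
    pthread_mutex_unlock(&slot_lock);
    return NULL;
}

/* Runs one batch and returns how many workers the manager joined. */
int run_handshake_batch(void)
{
    int reaped = 0;

    for (int i = 0; i < NWORKERS; i++) {
        ids[i] = i;
        int rc = pthread_create(&tids[i], NULL, worker, &ids[i]);
        assert(rc == 0);
    }
    while (reaped < NWORKERS) {
        pthread_mutex_lock(&slot_lock);
        while (worker_stopped == 0)
            pthread_cond_wait(&slot_full, &slot_lock);
        int index = stopped_index;
        worker_stopped = 0;               /* slot is empty again */
        pthread_cond_signal(&slot_free);  /* let a waiting worker deposit */
        pthread_mutex_unlock(&slot_lock);

        pthread_join(tids[index], NULL);
        reaped++;
    }
    return reaped;
}
```

Both sides re-check the flag in a `while` loop, so a wakeup that
arrives after another worker has already refilled the slot just makes
the woken worker wait again.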
Note that pthread_join() takes a value of type pthread_t, which could be
a struct. So your
worker_stopped = pthread_self()
should be
worker_id = pthread_self()
worker_stopped = 1
However...
It doesn't look like you need your cond and mutex at all. Just do this,
plus error checking:
pthread_t *tids = malloc(n * sizeof(pthread_t));
while (more work to do) {
for (i = 0; i < n; i++)
pthread_create(&tids[i], ...);
while (i != 0)
pthread_join(tids[--i], ...);
}
free(tids);
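A compilable sketch of the create-then-join loop above, under the
assumption of a placeholder worker that simply returns its index.
`batch_worker` and `run_batches` are names made up for this
illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical worker: returns its own index as its exit status. */
static void *batch_worker(void *arg)
{
    /* ... do some work ... */
    return arg;
}

/* Runs `batches` batches of `n` threads each. Returns the number of
 * threads joined, or -1 if allocation or thread creation failed. */
long run_batches(int batches, int n)
{
    pthread_t *tids = malloc((size_t)n * sizeof *tids);
    long joined = 0;

    if (tids == NULL)
        return -1;
    for (int b = 0; b < batches; b++) {
        int started = 0;

        while (started < n) {
            if (pthread_create(&tids[started], NULL, batch_worker,
                               (void *)(intptr_t)started) != 0)
                break;
            started++;
        }
        int failed = (started < n);

        /* Join in reverse order; threads that finished early just wait
         * to be joined, their status is kept until then. */
        while (started > 0) {
            void *result;
            int rc = pthread_join(tids[--started], &result);
            assert(rc == 0);
            assert((intptr_t)result == started);
            joined++;
        }
        if (failed) {
            free(tids);
            return -1;
        }
    }
    free(tids);
    return joined;
}
```

For example, `run_batches(3, 4)` starts and joins twelve threads in
three rounds, with no mutex or condition variable involved.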
Though why do you stop and restart threads at all? Maybe a thread pool
would be better: The thread functions loop and pick up tasks which you
post to a task queue. If a task must not be started until all tasks
in the previous batch are done, use pthread_barrier_wait().
--
Hallvard
spool...@gmail.com writes:
> I also wanted to know is it "OK" to lock the mutex before launching
> the threads. I am doing it so that no thread exits and signals before
> the manager calls pthread_cond_wait.
It's OK, but as you've seen it is not sufficient to implement the sync
you need. Once the main thread has done pthread_cond_wait() which
releases the mutex, any or all the worker threads might get the mutex
and exit before the main thread gets scheduled again.
Another possibility is to detach the worker threads. See pthread_detach
or pthread_attr_init + pthread_attr_setdetachstate. The worker threads
can decrement worker_count before exiting, and the thread decrementing
it to zero can signal the main thread.
Likely not such a good idea if you immediately launch a new batch of
threads though, at least not if memory (stack space) is a limitation.
Sooner or later, at the time you create the new threads, a bunch of old
worker threads will still be alive and thus take up stack space.
(They'd be sitting just before their pthread_exit call.)
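A rough sketch of the detached variant, assuming the counter is set
before any worker starts and that only the worker that brings it to
zero signals. The names `count_lock`, `batch_done`, `work_done` and
`run_detached_batch` are invented here.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t batch_done = PTHREAD_COND_INITIALIZER;
static int worker_count = 0;  /* workers still running in this batch */
static int work_done = 0;     /* workers that finished their work */

static void *detached_worker(void *arg)
{
    (void)arg;
    /* ... do some work ... */
    pthread_mutex_lock(&count_lock);
    work_done++;
    if (--worker_count == 0)          /* last one out wakes the manager */
        pthread_cond_signal(&batch_done);
    pthread_mutex_unlock(&count_lock);
    return NULL;
}

/* Starts n detached workers, waits for all of them, and returns how
 * many actually ran. */
int run_detached_batch(int n)
{
    pthread_attr_t attr;
    pthread_t tid;
    int done;

    int rc = pthread_attr_init(&attr);
    assert(rc == 0);
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    assert(rc == 0);

    pthread_mutex_lock(&count_lock);
    work_done = 0;
    worker_count = n;                 /* set before any worker can exit */
    pthread_mutex_unlock(&count_lock);

    for (int i = 0; i < n; i++) {
        if (pthread_create(&tid, &attr, detached_worker, NULL) != 0) {
            pthread_mutex_lock(&count_lock);
            worker_count--;           /* this worker never started */
            pthread_mutex_unlock(&count_lock);
        }
    }
    pthread_attr_destroy(&attr);

    pthread_mutex_lock(&count_lock);
    while (worker_count != 0)
        pthread_cond_wait(&batch_done, &count_lock);
    done = work_done;
    pthread_mutex_unlock(&count_lock);
    return done;
}
```

When the manager wakes up, the last worker may still be between its
unlock and its return, which is the stack-space caveat mentioned
above.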
--
Hallvard
Thanks Hallvard for the comments.
The reason I am using a condition variable and signals is that in
future the Thread Manager will have to implement a parallel mode and
a sequential mode along with the batch mode currently supported. I
figured it would give me good control over all the threads,
especially in parallel mode.
It also ensures that no thread exits before the Manager calls
pthread_join. Could you please enlighten me on where the
pthread_exit() return value is stored? I guess it would be lost if the
thread calls pthread_exit before the Manager calls pthread_join.
Correct me if I am wrong.
Thread pool looks like a good idea. Although memory is not an issue, I
would like the application to be efficient.
Appreciate your time.
> Also it gives me control that no thread exit before the manager calls
> pthread_join.
You want the thread to wait for the manager so that the manager can
wait for the thread? That doesn't make any sense.
> Could you please enlighten me on where is the
> pthread_exit() return value stored?
Somewhere internally to the implementation.
> I guess it would be lost if the
> thread calls pthread_exit before the manager calls pthread_join.
> Correct me if I am wrong.
No, it would be stored. But there's no reason you should care. You
should care about the completion status of particular units of work,
not the vehicles that do them.
If FedEx is delivering a package for you, you care whether or not the
package got to its destination. But you shouldn't care about the
status of the truck that happened to carry it to its destination. It
sounds like you're crossing layers of abstraction.
> Thread pool looks like a good idea. Although memory is not an issue, I
> would like the application to be efficient.
Thread pools save the overhead of creating, destroying, and
synchronizing threads. Plus, it decreases context switches as whatever
thread happens to already be running can take the next job.
DS
What I basically want is to collect the return status of each thread
that the Manager scheduled earlier, so that I can schedule a new task
or maybe repeat the same task depending upon certain calculations.
The Manager does not know which thread will exit first, hence I cannot
call pthread_join on a particular thread ID. After scheduling all
threads, the Manager waits on a condition variable until it receives a
signal from a terminating thread and then collects the return status of
that particular thread with pthread_join (the condition variable gives
me the ID of the terminating thread).
The Manager does a calculation after it collects the return status
from one thread and before it loops to collect the status from the
next thread. The time this calculation takes can vary. While the
Manager is calculating, I do not want another thread to exit, as I
suspect the return status from pthread_exit of the terminating thread
will be lost because of the delay.
Let's take a scenario:-
Manager schedules two thread A & B
Both A & B finish at almost same time and call pthread_exit
Manager does pthread_join on B and starts some calculation
Lets say calculations take 5 minutes.
Now Manager calls pthread_join on thread A >>>> Will the return status
of A still be preserved???
If yes, I would definitely like to know where FedEx stored my package
for all 5 minutes :) (Not that I have to implement something based on
it but it's just my curiosity)
Thanks
1 Manager
10 Workers
Manager room has 10 doors. One for each worker.
Each door has a "Available/Busy" sign on it.
Initial sign on all doors is "Busy"
Manager assigns 10 workers some assignment and kicks them out of his
room.
After kicking all workers out of his room, the sign on each door is
changed to "Available"
Workers do their work and rush back to Manager room. Workers look at
the door sign before they enter the room. If "Busy" they just wait
outside.
As soon as the first worker enters, the sign on all doors is changed
to "Busy" (a mutex provides the atomicity) so that no other worker
enters.
The Manager goes through the worker's finished assignment, assigns him
a new one and kicks him out of the room again. (This task might change
depending upon sequential, batch or parallel scheduling mode.)
As soon as the current worker goes out of the room, the door sign on
all doors is changed to "Available"
The cycle continues...
Let me see if I can model something here based on your initial info... This
will be in C++/C'ish pseudo-code so please try to bear with me here; BTW,
the `worker_stopped' variable introduces a massive race-condition in your
pseudo-code... Anyway, here I go:
________________________________________________________________
#define WORKER_COUNT 10
struct worker_message {
pthread_t tid;
whatever result;
};
struct worker_manager {
queue<worker_message*> queue;
int count; // = 0;
mutex mutex;
condvar cond;
};
static worker_manager manager;
static void* worker_entry(void*);
void worker_manager_daemon() {
mutex::guard lock(manager.mutex);
spawn_workers:
for (int i = 0; i < WORKER_COUNT; ++i) {
pthread_t tid;
pthread_create(&tid, NULL, worker_entry, NULL);
++manager.count;
}
// daemon loop...
for (;;) {
worker_message* wmsg;
while (! (wmsg = manager.queue.trypop())) {
manager.cond.wait(lock);
}
pthread_join(wmsg->tid, NULL);
// act on `wmsg->result'...
delete wmsg;
--manager.count;
if (! manager.count) {
goto spawn_workers;
}
}
}
void* worker_entry(void* arg) {
{
// do some work
}
worker_message* wmsg = new worker_message;
wmsg->tid = pthread_self();
wmsg->result = /* whatever */;
{
mutex::guard lock(manager.mutex);
manager.queue.push(wmsg);
}
manager.cond.signal();
return 0;
}
________________________________________________________________
AFAICT, that does what your initial pseudo-code requires... Please correct
me if I am wrong.
Thank you.
BTW, if you want me to create this in the form of fully compilable code,
just ask!
:^D
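> Now Manager calls pthread_join on thread A >>>> Will the return status
> of A still be preserved???
Yes.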
> If yes, I would definitely like to know where FedEx stored my package
> for all 5 minutes :)
Who cares? It's the OS's job. To be able to implement pthread_join(),
it must remember a thread's ID and pthread_exit() value until you
pthread_join() it, unless the thread was detached. Just like fork()'s
pid and exit status stay around until you wait() for it, unless you have
"detached" it with signal(SIGCHLD, SIG_IGN) if I remember correctly.
--
Hallvard
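The point that a joinable thread's exit value is kept until it is
joined can be checked with a small sketch. It assumes the delay is a
plain `sleep()` standing in for the manager's calculation;
`quick_worker` and `join_after_delay` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

/* Exits immediately, passing its argument back as the exit status. */
static void *quick_worker(void *arg)
{
    pthread_exit(arg);
    return NULL; /* not reached */
}

/* Starts a joinable thread, waits `delay_seconds` (long after the
 * thread has exited), then joins it and returns its exit status. */
long join_after_delay(long status, unsigned delay_seconds)
{
    pthread_t tid;
    void *result = NULL;

    int rc = pthread_create(&tid, NULL, quick_worker,
                            (void *)(intptr_t)status);
    assert(rc == 0);

    sleep(delay_seconds);  /* stand-in for the manager's calculation */

    rc = pthread_join(tid, &result);
    assert(rc == 0);
    return (long)(intptr_t)result;
}
```

Calling `join_after_delay(42, 1)` hands back 42 even though the
thread finished about a second before it was joined.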
This is a classic bad design pattern. What you really want is the
first available worker to do a job. But you force the manager to
assign it to a particular worker, whether that's the first available
or not.
As a result, workers can never do two jobs without a manager
intervening, noticing that they're not busy, and then assigning them a
job. This forces workers to go through an extended "done with one job
waiting for another" wherein they must be descheduled and rescheduled.
Worse, it forces the association between jobs and threads into higher
levels of the code than they belong. If I ask FedEx to send a package,
I want to track that package, not the truck it happens to be on.
Analogously, the managers should be tracking *jobs* not threads.
DS
Thanks Chris, that's pretty much what I am doing in my application.
But how do I know which thread_id to call pthread_join on first?
There is no pattern in which the threads will terminate. I do not want
to keep waiting on the 1st thread to terminate if the 4th thread has
already terminated.
That's more like a requirement of my application. The sole purpose of
Manager being there is to have strict control on which thread gets
which job assigned and in what pattern(sequential, batch or parallel).
Thanks all.
For reference, I sent a private reply to this since I got a private
copy of this message. Summary - there is no pthread_join(any thread)
function. So let the worker threads detach, store the result in the
submitted task (from manager), and submit the task back to a result
queue which the manager thread reads.
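That summary could look roughly like the sketch below. It assumes a
simple linked-list result queue and a placeholder computation
(doubling the input); `struct task`, `result_push`, `result_pop` and
`run_with_result_queue` are invented for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct task {
    int input;
    int result;          /* filled in by the worker */
    struct task *next;   /* link in the result queue */
};

static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_nonempty = PTHREAD_COND_INITIALIZER;
static struct task *q_head = NULL, *q_tail = NULL;

static void result_push(struct task *t)
{
    t->next = NULL;
    pthread_mutex_lock(&q_lock);
    if (q_tail != NULL)
        q_tail->next = t;
    else
        q_head = t;
    q_tail = t;
    pthread_cond_signal(&q_nonempty);
    pthread_mutex_unlock(&q_lock);
}

static struct task *result_pop(void)
{
    pthread_mutex_lock(&q_lock);
    while (q_head == NULL)
        pthread_cond_wait(&q_nonempty, &q_lock);
    struct task *t = q_head;
    q_head = t->next;
    if (q_head == NULL)
        q_tail = NULL;
    pthread_mutex_unlock(&q_lock);
    return t;
}

static void *queue_worker(void *arg)
{
    struct task *t = arg;

    t->result = t->input * 2;  /* placeholder for the real work */
    result_push(t);            /* the task itself carries the result */
    return NULL;
}

/* Hands n tasks (n <= 16) to detached threads and collects them in
 * completion order; returns the sum of the results. */
int run_with_result_queue(int n)
{
    struct task tasks[16];
    int sum = 0;

    assert(n >= 0 && n <= 16);
    for (int i = 0; i < n; i++) {
        pthread_t tid;
        tasks[i].input = i;
        int rc = pthread_create(&tid, NULL, queue_worker, &tasks[i]);
        assert(rc == 0);
        pthread_detach(tid);
    }
    for (int i = 0; i < n; i++) {
        struct task *t = result_pop();  /* whichever finished first */
        sum += t->result;
    }
    return sum;
}
```

The manager never needs a thread ID here: it blocks on the queue and
handles tasks in whatever order they complete.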
--
Hallvard
In what way are the threads different, so that it matters?
Are they created with different attributes? Does something
remember and make use of their thread IDs?
> and in what pattern(sequential, batch or parallel).
--
Hallvard
Sorry for that. I accidentally hit "reply to author". Need some
caffeine :(
Threads are not different in terms of pthread_create attributes.
It's just that the Manager decides which task will be assigned next
based on certain parameters, which include the total time taken by a
particular task (for which I need the thread to notify me that it has
stopped so that the Manager can do its calculation).
I cannot set up a predefined queue and let workers take the first task
in the queue. The order of tasks can be shuffled, and that decision is
made only after one task gives its results back to the Manager, which
in turn decides the next task to be scheduled.
That is the reason I need each thread to signal that it has stopped
and wait for its next task to be assigned by the Manager instead of
picking one up on its own.
Also, the Manager needs to maintain a log of all its scheduling
activity.
I am looking forward to implementing the idea of reusing the same
thread to do another task assigned by the Manager instead of
destroying and creating a new thread every time.
> Thanks Chris, that's pretty much what I am doing in my application.
Okay. I take it that you have resolved the race-condition in the code wrt
the `worker_stopped' and `worker_id' variables right? The sample pseudo-code
I posted is race-free in that respect because it uses a queue to transmit
thread shutdown events which contain the exact tid's to the manager
directly...
> I am looking forward to implement the idea of reusing the same thread
> to do another task assigned by Manager instead of destroying and
> creating a new thread every time.
This can be fairly trivially implemented as follows:
<more pseudo-code! ;^)>
________________________________________________________________
#define WORKER_COUNT 10
struct work_item {
task_type task;
result_type result;
};
struct worker {
work_item work;
bool resume;
mutex mutex;
condvar cond;
};
struct worker_manager {
queue<worker*> queue;
mutex mutex;
condvar cond;
};
static worker_manager manager;
static void* worker_entry(void*);
void worker_manager_daemon() {
worker* w;
// spawn workers
for (int i = 0; i < WORKER_COUNT; ++i) {
pthread_t tid;
w = new worker;
w->resume = true;
w->work.task = /* init task */;
w->work.result = /* in progress */;
pthread_create(&tid, NULL, worker_entry, w);
}
// daemon loop...
for (;;) {
{
mutex::guard lock(manager.mutex);
while (! (w = manager.queue.trypop())) {
manager.cond.wait(lock);
}
}
/* review result in `w->work.result' */;
w->work.task = /* set next task */;
w->work.result = /* in progress */;
{
mutex::guard lock(w->mutex);
w->resume = true;
}
w->cond.signal();
}
}
void* worker_entry(void* arg) {
worker* const this_worker = static_cast<worker*>(arg);
for (;;) {
{
/* perform task in `this_worker->work.task' */;
/* set result of task in `this_worker->work.result' */;
}
{
mutex::guard lock(manager.mutex);
this_worker->resume = false;
manager.queue.push(this_worker);
}
manager.cond.signal();
mutex::guard lock(this_worker->mutex);
while (! this_worker->resume) {
this_worker->cond.wait(lock);
}
}
return 0;
}
________________________________________________________________
The thread shutdown procedure is not shown here; however, it's relatively
simple. Does that do what you want? Or, do you need all spawned threads to
finish and await resumption _before_ you assign any new tasks? That can be
trivially designed as well...
> It's just that Manager decides which task will be assigned next based
> on certain parameters which includes the total time taken by a
> particular task(for which I need the thread to notify me that it has
> stopped and Manager can do it's calculation).
That's fine. But you are thinking "manager thread" when what you mean
is "manager logic". The logic is the same regardless of what thread
runs it.
> I can not assign a predefined queue and let workers take a first task
> in the queue. The order of tasks can be shuffled and that decision is
> made only after one task gives its results back to the Manager which
> in turn decides the next task to be scheduled.
So when a thread is done, have it directly call the manager code to
decide what to do next. For that instant, that thread becomes the
"manager thread".
You seem stuck on the idea that logical function elements of the code
must be associated with particular threads. This is rarely a smart
design pattern.
You have some "manager" code. Any time a thread realizes that some
manager work needs to be done, it should call into the manager code to
do some managing. No reason to switch to the thread that did some
manager work before.
> That is the reason I need each thread to signal stop and wait for it's
> next task assigned by Manager instead of picking it up own its own.
> Also Manager needs to maintain a log of all its scheduling activity.
Fine, so when a thread finishes a job, it becomes the manager. It can
log its activity, do whatever it has to do, and get assigned the next
job. Then it stops being the manager.
There is no reason one thread must always be the manager and many
reasons not to do things this way.
> I am looking forward to implement the idea of reusing the same thread
> to do another task assigned by Manager instead of destroying and
> creating a new thread every time.
At the same time, eliminate the restriction that the manager logic
always run in the same thread. That's pointless and forces context
switches.
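A sketch of what "the finishing thread runs the manager logic" might
look like, assuming the scheduling decision is reduced to handing out
the next task number. `manager_next`, `self_managing_worker` and
`run_self_managed` are illustrative names only, and a real manager
would do its calculation and logging where the comments indicate.

```c
#include <assert.h>
#include <pthread.h>

#define TOTAL_TASKS 20

static pthread_mutex_t manager_lock = PTHREAD_MUTEX_INITIALIZER;
static int next_task = 0;   /* manager state, touched only under the lock */
static int completed = 0;

/* Manager logic, run by whichever thread just finished a task.
 * Pass -1 when the caller has not finished anything yet.
 * Returns the next task id, or -1 when nothing is left. */
static int manager_next(int finished_task)
{
    int task = -1;

    pthread_mutex_lock(&manager_lock);
    if (finished_task >= 0)
        completed++;        /* review the result, update the log, ... */
    if (next_task < TOTAL_TASKS)
        task = next_task++; /* the scheduling decision would go here */
    pthread_mutex_unlock(&manager_lock);
    return task;
}

static void *self_managing_worker(void *arg)
{
    (void)arg;
    int task = manager_next(-1);

    while (task >= 0) {
        /* ... perform the task outside the manager lock ... */
        task = manager_next(task);  /* briefly "become the manager" */
    }
    return NULL;
}

/* Runs all tasks on nthreads threads (1..8); returns tasks completed. */
int run_self_managed(int nthreads)
{
    pthread_t tids[8];

    assert(nthreads >= 1 && nthreads <= 8);
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, self_managing_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return completed;  /* safe to read: every worker has been joined */
}
```

The manager state is still touched by one thread at a time, but no
thread has to be woken up just to hand out the next job.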
DS
Good advice!
Thanks, got your point. I should have explained the actual application
in a bit more detail in my first post only.
The reason I call that a Manager thread is because we actually need a
separate manager in our application. We specifically designed our
threads to be dumb and do only the task they are scheduled to do.
The application is designed to work on prototype hardware. The worker
threads run a set of diagnostics on a hardware component which might
be unstable and can cause the thread to hang or go in long I/O wait.
It is not unlikely that all the workers threads will be stuck in long
I/O wait. If that happens no one will execute the Manager code and the
application will hang. We need the application to be interactive at
all times and return back if the user wants or in case of timeout.
Hence the manager comes in. The Manager does not mess with prototype
hardware and hence is unlikely to go in I/O wait or hang. Also manager
has some Manager specific data. We would not want that data to be
global and shared with all threads in the application.
The above design is our initial design and was decided before reading
all the comments in this thread. I am going through the process of
redesigning and would incorporate the possible changes mentioned in
this thread.
Appreciate your help.
> Thanks, got your point. I should have explained the actual application
> in a bit more detail in my first post only.
>
> The reason I call that a Manager thread is because we actually need a
> separate manager in our application. We specifically designed our
> threads to be dumb and do only the task they are scheduled to do.
That really doesn't matter. Really.
You wrote some code. The code doesn't care what thread happens to run
it.
> The application is designed to work on prototype hardware. The worker
> threads run a set of diagnostics on a hardware component which might
> be unstable and can cause the thread to hang or go in long I/O wait.
This is a good reason to use threads. Threads can block in I/O for as
long as needed and the process can go on its merry way.
> It is not unlikely that all the workers threads will be stuck in long
> I/O wait. If that happens no one will execute the Manager code and the
> application will hang. We need the application to be interactive at
> all times and return back if the user wants or in case of timeout.
This means you need a timer. If you want, have a "timer thread" to run
the hang check code periodically. It doesn't in any way force you to
switch to a particular thread every time some other thread finishes a
task.
Running periodically to check for stuck *TASKS* (not stuck threads,
you don't care that a particular thread is stuck -- you care that a
particular operation is stuck) is just one of the things the manager
code needs to do, regardless of what thread happens to be running it.
> Hence the manager comes in. The Manager does not mess with prototype
> hardware and hence is unlikely to go in I/O wait or hang. Also manager
> has some Manager specific data. We would not want that data to be
> global and shared with all threads in the application.
You don't understand threads. Sorry to be so blunt, but you don't.
Threads are execution vehicles that cooperate to accomplish a task.
This is the only way they can work. You cannot isolate data in one
thread from another thread. You should not even try.
Sorry, all data is shared with all threads. There is nothing you can
do. If you don't want this, you need to use distinct processes rather
than threads.
> The above design is our initial design and was decided before reading
> all the comments in this thread. I am going through the process of
> redesigning and would incorporate the possible changes mentioned in
> this thread.
Good luck. This is a common anti-pattern. You may not want to go
through the trouble of fixing it in this particular application, as it
likely won't hurt you all that badly and may not be worth the effort
needed.
It will work with a manager thread. It's just extra work to make extra
inefficiency.
> Appreciate your help.
You're welcome. Don't do it that way next time. ;)
Think of threads as execution vehicles. They're what your process uses
to get work done. But the code should focus on what work needs to get
done and the status of that work, not what thread happens to be doing
it or the status of that thread.
DS
I am sorry, I could not understand this part. What would be a way to
check for a stuck task?
If the stuck task is waiting on a blocking call, how would the timer
thread know whether it is stuck or not? I cannot break the blocking
call to respond to the timer thread, and I cannot use a non-blocking
call in a loop with some "I am alive" signal, because it is a
diagnostic running on hardware and has to stay exactly the way it was
designed by a third party (I do not have control over it).
> > Hence the manager comes in. The Manager does not mess with prototype
> > hardware and hence is unlikely to go in I/O wait or hang. Also manager
> > has some Manager specific data. We would not want that data to be
> > global and shared with all threads in the application.
>
> You don't understand threads. Sorry to be so blunt, but you don't.
>
> Threads are execution vehicles that cooperate to accomplish a task.
> This is the only way they can work. You cannot isolate data in one
> thread from another thread. You should not even try.
>
> Sorry, all data is shared with all threads. There is nothing you can
> do. If you don't want this, you need to use distinct processes rather
> than threads.
>
No, I am not offended. I am here to learn.
I am using a thread-specific data key for my Manager, and even if I
don't, I do not understand how the local data of my Manager's initial
function is available to all other threads.
> > The above design is our initial design and was decided before reading
> > all the comments in this thread. I am going through the process of
> > redesigning and would incorporate the possible changes mentioned in
> > this thread.
>
> Good luck. This is a common anti-pattern. You may not want to go
> through the trouble of fixing it in this particular application, as it
> likely won't hurt you all that badly and may not be worth the effort
> needed.
>
> It will work with a manager thread. It's just extra work to make extra
> inefficiency.
>
I do not understand what the benefits are of using a timer thread
which periodically checks for hung tasks over a Manager which
waits for a signal from a terminating thread.
Manager code is always going to run no matter whether I choose your
design or mine. Whether timer or Manager, there is an extra thread
overhead in both cases.
One argument I can think of is that the worker thread has to wait while
the Manager makes its decision on which task to schedule next. But in
any case, the Manager code has to be under a mutex in a critical
section, as only one thread should be allowed to execute it at a time.
So the only extra overhead with the Manager is the signal, but it saves
me the overhead of executing timer code and polling for the status of
stuck tasks.
I am not creating and destroying thread(I have to fix that).
> I am sorry, I could not understand this part. What could possibly be a
> way to check for some stuck task?
> If the stuck task is waiting on a blocking call, how would the timer
> thread know if it is stuck or not. I can not not break the blocking
> call to respond to timer thread and I can not use a non blocking call
> in loop with some "I am alive" signal as it is a diagnostic running on
> a hardware which has to be exactly the way it was designed by some
> third person(I do not have control over it).
I guess I don't understand something. Presumably these threads were
calling some code when they were finished with tasks. Otherwise, there
would be no way for the manager thread to know when they were done. So
why can't this code set some kind of 'done flag' that you can check
any time you need to?
You could solve this problem two obvious ways:
1) When you finish a task or start a task, you update a 'task status'
block with a timestamp or whatever information is useful. The manager
timer task can check these status blocks any time it needs to.
2) Before you enter a call that could block, and after you return from
it, you update a 'task status' block.
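One possible shape for such a 'task status' block, assuming a
per-task mutex and a caller-supplied clock value so the check is easy
to exercise. `struct task_status` and the `status_*` functions are
hypothetical names, not anything from the original application.

```c
#include <assert.h>
#include <pthread.h>
#include <time.h>

enum task_state { TASK_IDLE, TASK_RUNNING, TASK_DONE };

struct task_status {
    pthread_mutex_t lock;
    enum task_state state;
    time_t started;   /* when the current task (or blocking call) began */
};

void status_init(struct task_status *s)
{
    int rc = pthread_mutex_init(&s->lock, NULL);
    assert(rc == 0);
    s->state = TASK_IDLE;
    s->started = 0;
}

/* Worker side: call just before starting a task or a call that may block. */
void status_begin(struct task_status *s, time_t now)
{
    pthread_mutex_lock(&s->lock);
    s->state = TASK_RUNNING;
    s->started = now;
    pthread_mutex_unlock(&s->lock);
}

/* Worker side: call right after the task or blocking call returns. */
void status_end(struct task_status *s)
{
    pthread_mutex_lock(&s->lock);
    s->state = TASK_DONE;
    pthread_mutex_unlock(&s->lock);
}

/* Timer side: nonzero if the task has been running longer than the limit. */
int status_is_stuck(struct task_status *s, time_t now, double limit_seconds)
{
    int stuck;

    pthread_mutex_lock(&s->lock);
    stuck = (s->state == TASK_RUNNING &&
             difftime(now, s->started) > limit_seconds);
    pthread_mutex_unlock(&s->lock);
    return stuck;
}
```

In real use the worker would pass `time(NULL)` to `status_begin()`,
and the timer code would call `status_is_stuck()` on each block with
the current time. Because `status_begin()` resets the state, a stale
"done" from the previous task cannot hide a hang in the next one.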
> I am using thread specific data key for my Manager and even if I
> don't,
That's fine. Thread-specific data is global and can be accessed by any
thread (assuming you mean POSIX TSD, not some compiler extension).
> I do not understand how is the local data of my Manager initial
> function available to all other threads?
Okay, something's not making sense here. Either I'm not explaining
things clearly or something, but I don't know what. Your manager was
somehow telling these threads what tasks to do. Now you are claiming
you have no way to communicate with them from the manager code? How
does that make sense?
Somewhere, you have to have some kind of structure that identifies
what these threads are doing or that they use to report completion or
something. Even if you call pthread_create, you have to pass it
*something*. That something can contain any information you want,
including the address of structures you previously treated as local.
> > Good luck. This is a common anti-pattern. You may not want to go
> > through the trouble of fixing it in this particular application, as it
> > likely won't hurt you all that badly and may not be worth the effort
> > needed.
> > It will work with a manager thread. It's just extra work to make extra
> > inefficiency.
> I do not understand what are the benefits of using a timer thread
> which will periodically check for hung task over a Manager which will
> wait for a signal from terminating thread?
When another thread finishes a task, in your scheme, it must yield to
the manager thread before it can start on another task. That means two
senseless context switches.
In the worst case, your pattern is:
Manager runs,
Manager creates thread
Thread runs
Destroy thread
Manager notices thread is destroyed
repeat
Mine is:
Thread runs
repeat
> Manager code is always gonna run no matter whether I choose your
> design or mine. Whether timer or Manager, there is an extra thread
> overhead in both cases.
A thread that runs rarely to do a minor task is not significant
overhead. A thread that has to be run every time you want to do work
is.
> One argument I can think of is, the worker thread has to wait while
> Manager makes its decision on which task to schedule next.
Exactly. This means a minimum of two extra context switches. It also
means that if the manager thread ever blocks for some reason, you
cannot start any work until it unblocks.
> But in any
> case, the Manager code has to be under mutex in critical section as
> only on thread should be allowed to execute it at one time. So the
> only extra overhead with Manager is the signal but it saves me the
> overhead of executing timer code and polling for status of stuck task.
Assuming it takes significantly more time to do work than to start
work and the number of threads is not unreasonable, the manager mutex
will almost never be contended. An uncontended mutex is nearly free. A
context switch is several orders of magnitude more expensive.
> I am not creating and destroying thread(I have to fix that).
Good. Since you won't be creating and destroying threads, you can't
use 'pthread_create' to pass the job information or 'pthread_join' to
get the return value. As a result, you're already doing 85% of the
work to eliminate the manager thread as a job dispatch bottleneck.
DS
The threads were signaling a condition variable which the Manager was
waiting on.
Each thread also has a persistent structure which can store a done
flag. However, a done flag would not be sufficient: once done is set
and the thread goes on to run another task, the flag will still say
done even if the next task hangs. I can set some timestamps, but the
timer thread has to do the work of going through each task's flag to
make sure it is within the time limit.
> You could solve this problem two obvious ways:
>
> 1) When you finish a task or start a task, you update a 'task status'
> block with a timestamp or whatever information is useful. The manager
> timer task can check these status blocks any time it needs to.
>
> 2) Before you enter a call that could block, and after you return from
> it, you update a 'task status' block.
>
> > I am using thread specific data key for my Manager and even if I
> > don't,
>
> That's fine. Thread-specific data is global and can be accessed by any
> thread (assuming you mean POSIX TSD, not some compiler extension).
>
> > I do not understand how is the local data of my Manager initial
> > function available to all other threads?
>
> Okay, something's not making sense here. Either I'm not explaining
> things clearly or something, but I don't know what. Your manager was
> somehow telling these threads what tasks to do. Now you are claiming
> you have no way to communicate with them from the manager code? How
> does that make sense?
>
The Manager does pass a structure to the thread. This structure (say a)
is a member of another structure (say A). The thread only gets A.a.
A is the main structure, which is a node in the linked list of tasks.
This linked list is created by the Manager, which has access to all
nodes, but a thread only gets A.a. (Technically yes, the linked list
can be accessed by any thread as it is malloc'd, but the threads do not
know how to.)
There is some data the Manager maintains, say "total", which is local
to the Manager and not passed to any thread. I would have to make all
these local variables global and safeguard them with a mutex.
From my understanding:-
Mine is:
Manager create threads and wait(with timed wait to catch hung tasks)
Thread runs
Thread Signals
Manager does calculation and reschedules task to the thread
Repeat
If I implement yours:
Threads Run
Thread completes and wait on manager code mutex and runs manager code
(Manager code can not be run by two threads at same time. Some
calculations might create problem. Also now all threads need to have
access to each others data as it is required in calculation)
Thread gets new task from manager code result.
Repeat
Also Timer thread is running periodically.
> > Manager code is always gonna run no matter whether I choose your
> > design or mine. Whether timer or Manager, there is an extra thread
> > overhead in both cases.
>
> A thread that runs rarely to do a minor task is not significant
> overhead. A thread that has to be run every time you want to do work
> is.
>
> > One argument I can think of is, the worker thread has to wait while
> > Manager makes its decision on which task to schedule next.
>
> Exactly. This means a minimum of two extra context switches. It also
> means that if the manager thread ever blocks for some reason, you
> cannot start any work until it unblocks.
>
> > But in any
> > case, the Manager code has to be under mutex in critical section as
> > only one thread should be allowed to execute it at one time. So the
> > only extra overhead with Manager is the signal but it saves me the
> > overhead of executing timer code and polling for status of stuck task.
>
> Assuming it takes significantly more time to do work than to start
> work and the number of threads is not unreasonable, the manager mutex
> will almost never be contended. An uncontended mutex is nearly free. A
> context switch is several orders of magnitude more expensive.
>
Agreed.
> > I am not creating and destroying thread(I have to fix that).
>
> Good. Since you won't be creating and destroying threads, you can't
> use 'pthread_create' to pass the job information or 'pthread_join' to
> get the return value. As a result, you're already doing 85% of the
> work to eliminate the manager thread as a job dispatch bottleneck.
>
> DS
This is what I have gathered from our discussion (correct me if I am
wrong):
1) The bottleneck in getting a new task to schedule will always be there, as
it is inherent in the application (Manager code mutex and
calculation).
2) I do agree that I will be saving the expense of switching to the
manager task with your design.
3) However, your design does have a little overhead: the timer thread
going through each task's status periodically, and the need to let all
threads share all data.
I think I need time to actually sit down with my application
requirements in hand to work out how, and whether, I could implement your
design, or maybe implement some of its features.
Appreciate your time. Helped me a lot to gain some good knowledge.
I've just recently been dipping my toes into pthreads, and I've
been wondering about that.
Since pthreads are pretty well thought out, I assume that there are
good reasons for not having that function, but the reasons why it
would be a Bad Thing don't seem obvious to me.
It's probably just that my threaded experience is limited to pretty
simple examples. That and that my thinking is skewed by multi-process
thinking and mapping pthread_join() to wait().
Is there a relatively simple explanation?
--
Drew Lawson
I only came in search of answers, never planned to sell my soul
I only came in search of something left that I could call my own
You mean the Manager thread, rather than Manager module?
Once a worker thread is done with its task and wants to tell that to the
manager thread, even if just by calling pthread_exit(), it is no more
likely than the manager thread to do any more I/O. Unless your or the
task's code has a bug:
- if the task has overwritten its caller's stack or data structures.
But then, if you don't trust it to refrain from that, you probably
can't trust it to not muck with other data structures in the program
either. Such as the manager module's data.
- if the task has set up an alarm or can get SIGCHLD or something, and
may call a signal handler after the task is supposedly done. But
while signals can be set up to be delivered to a specific thread, signal
handlers are per-process. So if a task sets a signal handler it can
in any case ruin other tasks using the same signal for something else.
If this worries you, do not use threads. Use a separate process for
each task. That way, any mess the task made for your data structures or
signals or whatever, goes away with the process which executed the task.
If that's not a problem, then when a task is done, then it is done. It
doesn't make a difference whether the thread which performed the task
just calls pthread_exit() or runs some more complicated code which is
part of the manager module. Like waiting for a new task to perform.
_Something_ must execute the code in the manager module. pthread_exit()
is in this context a little part of that code: it informs the manager
thread that the worker thread is done. Manager code in the worker
threads with mutexes/conds in a thread pool are more complex, but that's
all.
I wonder if you think threads are more magical than they really are.
However, threads have even been implemented purely in user space. Such a
library must provide wrappers around system calls, e.g. blocking I/O
must be turned into non-blocking or select(), where an EWOULDBLOCK
return results in the thread library doing a context switch.
--
Hallvard
> > I guess I don't understand something. Presumably these threads were
> > calling some code when they were finished with tasks. Otherwise, there
> > would be no way for the manager thread to know when they were done. So
> > why can't this code set some kind of 'done flag' that you can check
> > any time you need to?
> The threads were signaling a condition variable which Manager was
> waiting on.
> The thread also has a persistent structure which can store a done flag.
> However done flag would not be sufficient as once done is set and
> thread goes to run another task, flag will still be done even if next
> task hangs.
If the thread goes to run another task, *that* task is not done. The
task it finished is still done.
> I can set some timestamps but the timer thread has to do
> the work of going through each tasks flag to make sure it is within
> time limit.
Or you can set a timer for each task. It depends on how your code is
structured. Ideally, you would already have some kind of sensible
timer infrastructure. But if not, you can fake it with a scan.
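A minimal sketch of the "fake it with a scan" approach, assuming each worker owns a status block that it stamps when it starts a task and clears when it finishes. Because the stamp is reset per task, a finished earlier task cannot hide a hung later one. All names here (task_status, status_task_begin, scan_for_hung and so on) are hypothetical and not from the code under discussion.

```c
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-worker status block (an assumption, not the poster's struct). */
struct task_status {
    pthread_mutex_t lock;
    int             busy;        /* 1 while a task is running */
    time_t          started_at;  /* when the current task started */
};

void status_init(struct task_status *s)
{
    pthread_mutex_init(&s->lock, NULL);
    s->busy = 0;
    s->started_at = 0;
}

/* Worker calls this right before starting a task. */
void status_task_begin(struct task_status *s, time_t now)
{
    pthread_mutex_lock(&s->lock);
    s->busy = 1;
    s->started_at = now;
    pthread_mutex_unlock(&s->lock);
}

/* Worker calls this right after finishing a task. */
void status_task_end(struct task_status *s)
{
    pthread_mutex_lock(&s->lock);
    s->busy = 0;
    pthread_mutex_unlock(&s->lock);
}

/* Timer thread calls this periodically.
 * Returns how many tasks have been running longer than `limit` seconds. */
size_t scan_for_hung(struct task_status *tab, size_t n, time_t now, time_t limit)
{
    size_t hung = 0;
    for (size_t i = 0; i < n; i++) {
        pthread_mutex_lock(&tab[i].lock);
        if (tab[i].busy && now - tab[i].started_at > limit)
            hung++;
        pthread_mutex_unlock(&tab[i].lock);
    }
    return hung;
}
```

In real use the timer thread would pass time(NULL) as `now` and do something about each hung task rather than just count them; the count is only there to keep the sketch small.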
> > Okay, something's not making sense here. Either I'm not explaining
> > things clearly or something, but I don't know what. Your manager was
> > somehow telling these threads what tasks to do. Now you are claiming
> > you have no way to communicate with them from the manager code? How
> > does that make sense?
> The manager does pass a structure to the thread. This structure(say a)
> is a member of another structure(say A). The thread only gets A.a
> A is the main structure which is a node in the link list of tasks.
Okay. The list of tasks is global.
> This link list is created by Manager which has access to all nodes but
> a thread only gets A.a(Technically yes link list can be accessed by
> any thread as it is malloc'd but threads do not know how to)
You are still thinking about this all wrong. You say "threads do not
know how to". Threads just run code. It's the code that either knows
how to or doesn't.
> From my understanding:-
>
> Mine is:
> Manager creates threads and waits (with a timed wait to catch hung tasks)
> Thread runs
> Thread Signals
> Manager does calculation and reschedules task to the thread
> Repeat
Right. Which means a thread cannot go from doing X to doing Y without
context switches.
> If I implement yours:
> Threads Run
> Thread completes, waits on the manager code mutex, and runs manager code
> (Manager code can not be run by two threads at same time. Some
> calculations might create problem. Also now all threads need to have
> access to each others data as it is required in calculation)
> Thread gets new task from manager code result.
> Repeat
> Also Timer thread is running periodically.
Exactly. In the most common case, the mutex will be uncontended. A
worker thread will almost never have to wait for the manager and there
will almost never be extra context switches when work is waiting to be
done.
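Roughly, the design being described could look like the sketch below: the "manager code" is an ordinary function that whichever worker has just finished calls under a mutex to pick its next task, so no dedicated manager thread has to be scheduled. This is only an illustration under assumptions: the task list is reduced to a counter, the "work" is a stand-in, and names such as manager_next_task and run_pool are hypothetical.

```c
#include <pthread.h>

#define NUM_TASKS   20
#define NUM_WORKERS 4

/* Hypothetical manager state, shared by all workers. */
static pthread_mutex_t mgr_lock = PTHREAD_MUTEX_INITIALIZER;
static int next_task;              /* index of the next unscheduled task */
static int results[NUM_TASKS];     /* one result slot per task */

/* "Manager code": runs in whichever worker calls it, under mgr_lock.
 * Returns the next task index, or -1 when there is nothing left. */
static int manager_next_task(void)
{
    int t;
    pthread_mutex_lock(&mgr_lock);
    t = (next_task < NUM_TASKS) ? next_task++ : -1;
    pthread_mutex_unlock(&mgr_lock);
    return t;
}

static void *worker(void *arg)
{
    int t;
    (void)arg;
    /* Go straight from one task to the next without waking another thread. */
    while ((t = manager_next_task()) >= 0)
        results[t] = t * t;        /* stand-in for the real work */
    return NULL;
}

/* Runs all tasks on a small pool; returns how many results are correct. */
int run_pool(void)
{
    pthread_t tid[NUM_WORKERS];
    int started = 0, ok = 0;

    next_task = 0;
    for (int i = 0; i < NUM_WORKERS; i++)
        if (pthread_create(&tid[started], NULL, worker, NULL) == 0)
            started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    for (int i = 0; i < NUM_TASKS; i++)
        ok += (results[i] == i * i);
    return ok;
}
```

The mutex is held only for the few instructions it takes to pick a task, which is why it is expected to be uncontended most of the time when the tasks themselves take much longer.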
> 1) The bottleneck in getting a new task to schedule will always be there, as
> it is inherent in the application (Manager code mutex and
> calculation).
That's fine. That's just a tiny bit of code. As long as you don't
force the scheduler to invoke a specific thread in order to get it
done, and allow the thread that's already running to do it, it will be
nearly free.
> 2) I do agree that I will be saving the expense of switching to
> manager task with your design.
> 3) However your design does have a little overhead of timer thread
> going through each task's status periodically and to allow all threads
> share all data.
You have to do that anyway. Your manager thread had to check on the
status of running threads periodically already.
> I think I need time to actually sit down with my application
> requirements in hand to work out how, and whether, I could implement your
> design, or maybe implement some of its features.
It may not be worth the effort unless your tasks typically take very
little time. The bigger each task is, the less overhead in dispatching
them matters. You don't need to do every single thing the most
efficient possible way. (Especially if it's already working.)
> Appreciate your time. Helped me a lot to gain some good knowledge.
You're welcome.
DS
Solaris thr_join() lets you wait for any thread.
The POSIX spec describes pthread_join() as just a convenience function.
So maybe the POSIX folks just didn't take the function very seriously,
and thus didn't spend much time on it?
http://www.opengroup.org/onlinepubs/009695399/functions/pthread_join.html
"RATIONALE
The pthread_join() function is a convenience that has proven useful in
multi-threaded applications. It is true that a programmer could
simulate this function if it were not provided by passing extra state
as part of the argument to the start_routine(). The terminating thread
would set a flag to indicate termination and broadcast a condition
that is part of that state; a joining thread would wait on that
condition variable. (...)"
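For illustration, the simulation the rationale describes might be sketched as follows: extra state is passed to the start routine, the terminating thread sets a flag and broadcasts, and the joiner waits on the condition variable. The struct and function names (join_state, my_join, demo_join) and the result value are assumptions made for this example only.

```c
#include <pthread.h>

/* Hypothetical extra state passed as the start_routine() argument. */
struct join_state {
    pthread_mutex_t lock;
    pthread_cond_t  done_cond;
    int             done;     /* set to 1 by the terminating thread */
    int             result;   /* stand-in for the thread's return value */
};

static struct join_state js = {
    .lock      = PTHREAD_MUTEX_INITIALIZER,
    .done_cond = PTHREAD_COND_INITIALIZER,
};

static void *start_routine(void *arg)
{
    struct join_state *s = arg;
    int r = 42;                        /* stand-in for the real work */

    pthread_mutex_lock(&s->lock);
    s->result = r;
    s->done = 1;                       /* the flag, not the signal, carries the news */
    pthread_cond_broadcast(&s->done_cond);
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Home-grown "join": waits for the flag rather than for thread termination. */
static int my_join(struct join_state *s)
{
    int r;
    pthread_mutex_lock(&s->lock);
    while (!s->done)                   /* loop guards against spurious wakeups */
        pthread_cond_wait(&s->done_cond, &s->lock);
    r = s->result;
    pthread_mutex_unlock(&s->lock);
    return r;
}

int demo_join(void)
{
    pthread_t tid;

    pthread_mutex_lock(&js.lock);
    js.done = 0;
    pthread_mutex_unlock(&js.lock);
    if (pthread_create(&tid, NULL, start_routine, &js) != 0)
        return -1;
    pthread_detach(tid);               /* pthread_join() is not needed at all */
    return my_join(&js);
}
```

The state is kept in static storage here so that it outlives the detached thread; with stack or heap storage the lifetime would need more care.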
--
Hallvard
> Since pthreads are pretty well thought out, I assume that there are
> good reasons for not having that function, but the reasons why it
> would be a Bad Thing don't seem obvious to me.
> It's probably just that my threaded experience is limited to pretty
> simple examples. That and that my thinking is skewed by multi-process
> thinking and mapping pthread_join() to wait().
> Is there a relatively simple explanation?
Yep, a very simple explanation. It would add cost to every thread
destruction.
The POSIX way is to provide cheap primitives. This way applications
that don't require more don't have to pay the cost of complexities
they won't use.
Using thread termination as a communicated event is an anti-pattern
anyway. You never care that a thread is gone, you care that it
finished the job it was doing. Waiting for thread termination prevents
the thread from going on to do something else once it finishes the job
the waiter cared about, which forces extra context switches for no
reason.
DS
It's more than that; "join any" would break modularity. It's simply wrong.
People tend to compare pthread_join() with UNIX wait*() functions, but
it's not analogous.
UNIX processes have a hierarchy. A process waits only for its own
children, and cannot accidentally "steal" a process termination status
from a peer. This is NOT the case with threads, which have no
parent/child relationships!
If libA creates a set of threads and expects completion status from
them, and libB (or main program) creates a set of threads and uses "join
all", then there is no way to prevent the risk that libB's join will
retrieve status from one of libA's threads. libB doesn't understand the
semantics of the termination, and libA will never see it at all. This
could easily break either facility, or both, depending on their
expectations with respect to return values and thread count.
There would be several ways to do "multiple join" in POSIX:
a) A real "join many" accepting a list of pthread_t values. Possible,
of course; but not "primitive".
b) A process-like hierarchy of threads so that each thread can only
join its own children. This would introduce an enormous amount of
complexity that's not really very useful, and in fact would get in
the way more often... for example, cooperating worker threads
couldn't join each other's children.
c) Thread "groups" that could be dynamically defined, allowing a "join
group" operation. Again, much more complexity and mechanism than
could really be justified by the gain.
If you create your own "join" with your own shared data and condition
variables, it's almost embarrassingly trivial to create a "join many"
that will wait for any of YOUR threads, but won't disturb any other
modular facility in the same process. Your semantics become flexible and
infinitely customizable.
Use pthread_join() when the simple function provides exactly what you
need, with no compromises. If it's not exactly what you need, do it
yourself with whatever semantics you need.
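As a rough sketch of such a do-it-yourself "join any", assuming a fixed-size group and hypothetical names (group, group_worker, join_any): each finishing worker appends its id to a list under the mutex, and the joiner hands the ids out one at a time. Because ids are queued rather than written into a single shared variable, two workers finishing back to back cannot overwrite each other's notification, and only threads belonging to this group are ever reported.

```c
#include <pthread.h>

#define GROUP_SIZE 3

/* Hypothetical private "thread group"; only the owning module sees it. */
struct group {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int finished[GROUP_SIZE];  /* ids of finished workers, in completion order */
    int nfinished;             /* how many ids have been queued */
    int nreported;             /* how many ids join_any() has handed out */
};

static struct group grp = {
    .lock = PTHREAD_MUTEX_INITIALIZER,
    .cond = PTHREAD_COND_INITIALIZER,
};

static void *group_worker(void *arg)
{
    int id = *(int *)arg;
    /* ... the real work would go here ... */
    pthread_mutex_lock(&grp.lock);
    grp.finished[grp.nfinished++] = id;   /* queue the id, never overwrite */
    pthread_cond_signal(&grp.cond);
    pthread_mutex_unlock(&grp.lock);
    return NULL;
}

/* Blocks until some worker of THIS group has finished; returns its id. */
static int join_any(void)
{
    int id;
    pthread_mutex_lock(&grp.lock);
    while (grp.nreported == grp.nfinished)
        pthread_cond_wait(&grp.cond, &grp.lock);
    id = grp.finished[grp.nreported++];
    pthread_mutex_unlock(&grp.lock);
    return id;
}

/* Starts the group and collects every completion; returns the sum of ids. */
int demo_join_any(void)
{
    static int ids[GROUP_SIZE];
    pthread_t tid[GROUP_SIZE];
    int sum = 0;

    grp.nfinished = grp.nreported = 0;    /* no workers are running yet */
    for (int i = 0; i < GROUP_SIZE; i++) {
        ids[i] = i + 1;
        if (pthread_create(&tid[i], NULL, group_worker, &ids[i]) != 0)
            return -1;
    }
    for (int i = 0; i < GROUP_SIZE; i++)
        sum += join_any();
    for (int i = 0; i < GROUP_SIZE; i++)
        pthread_join(tid[i], NULL);       /* only reclaims thread resources */
    return sum;
}
```

The predicate loop in join_any() is what makes a "lost" signal harmless: the joiner checks the counters under the mutex before waiting, so a completion recorded while it was busy elsewhere is still seen.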
I suspected that was my concept error.
>If libA creates a set of threads and expects completion status from
>them, and libB (or main program) creates a set of threads and uses "join
>all", then there is no way to prevent the risk that libB's join will
>retrieve status from one of libA's threads. libB doesn't understand the
>semantics of the termination, and libA will never see it at all. This
>could easily break either facility, or both, depending on their
>expectations with respect to return values and thread count.
Thanks. That makes it clear. I work within my own task's space,
so I tend to think from that perspective. That the threads cross
those bounds hadn't become clear to me.
>If you create your own "join" with your own shared data and condition
>variables, it's almost embarrassingly trivial to create a "join many"
>that will wait for any of YOUR threads, but won't disturb any other
>modular facility in the same process. Your semantics become flexible and
>infinitely customizable.
That's what I did when I needed it (or thought I did). I thought
it was a bit of a kludge at the time, but now I'm thinking it may
have been the best solution.
--
Drew Lawson | Savage bed foot-warmer
| of purest feline ancestry
| Look out little furry folk
| it's the all-night working cat