I would create a queue (the "Queue" module) and pass it to each child; the
child would put something (e.g. a reference to itself) in the queue when
it had finished.
You are using the wrong approach here. There is a much easier solution
to your problem, although it's not obvious at first. With this approach
the worker threads don't have to notify the parent thread that they are
ready.
The main thread creates a queue and a bunch of worker threads that pull
tasks from the queue. As long as the queue is empty, all worker threads
block and do nothing. When a task is put into the queue, a random worker
thread acquires the task, does its job, and goes back to sleep as soon as
it's done. That way you can reuse a thread over and over again without
creating new worker threads.
When you need to stop the threads you put a kill object into the queue
that tells the thread to shut down instead of blocking on the queue.
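Here is a minimal sketch of that pattern, with toy work (doubling a number) standing in for the real task. It uses Python 3's lowercase `queue` module; in Python 2, where this thread originated, the module is named `Queue`. `None` plays the role of the kill object, with one put per worker:

```python
import threading
from queue import Queue  # Python 2: the module is named Queue

q = Queue()
results = []

def worker():
    while True:
        task = q.get()            # blocks while the queue is empty
        if task is None:          # "kill object": tells the thread to shut down
            break
        results.append(task * 2)  # do the job (toy work here)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for n in range(5):
    q.put(n)                      # hand out tasks
for _ in threads:
    q.put(None)                   # one kill object per worker
for t in threads:
    t.join()
print(sorted(results))            # -> [0, 2, 4, 6, 8]
```

Note that `list.append` is safe to call from several threads in CPython; for anything more elaborate, put results on a second queue instead.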
Christian
>d> could i see an example of this maybe?
queue is a shared Queue instance.
The parent puts the work in the queue with queue.put(work_object).
After all work has been put in the queue it puts a sentinel
queue.put(None)
Each child thread (consumer) has a loop getting things out of the queue:
    while True:
        work = queue.get()
        if work is None: break
        # do real stuff using work
    queue.put(None)  # put back the sentinel for other worker threads.
    # finish
(Thanks to Dennis Lee Bieber)
Instead of None you can also create a special sentinel object and use that.
--
Piet van Oostrum <pi...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: pi...@vanoostrum.org
Let's get back to the topic of this message.
Here's how I have implemented it so far; I am taking the
queue-of-workload-items approach.
In my child thread, I keep checking for an available workload item
until a duration is reached.
#inside the child#
while endTime > time.time():
    try:
        item = self.q.get(True, 3)
    except Queue.Empty:  # what's wrong? AttributeError: class Queue has no attribute 'Empty'
        print 'cant find any work load item, so lets wait and try again later'
        time.sleep(1)  # wait and then check again
        continue
    except:
        print "Unexpected error:", sys.exc_info()[0]
        raise
    #do the real work with load item
In my parent thread, I spawn X child threads (X comes from a cfg file)
and keep adding load items to a shared q until the duration is reached.
#inside the parent#
callCounter = 0
workers = []  # a list of child threads
totalWorkers = 250
endTime = time.time() + duration
for i in range(totalWorkers):
    w = Worker(q, duration, i)
    w.start()  # worker, do your job now!
    workers.append(w)
while endTime > time.time():
    time.sleep(1)
    q.put(getWorkloadItem())  # add workload items
    callCounter += 1  # actually, can we guarantee that the call will be sent?
                      # should we ask each child to report the number of calls they make?
for i in range(totalWorkers):
    workers[i].join()  # wait for the child threads to finish
Overall, it seems to be working now, though I still have a couple of
problems to resolve.
1. I got the following error from the code that attempts to catch the
Empty exception. What's the right way to use it?
except Queue.Empty:
AttributeError: class Queue has no attribute 'Empty'
2. What's the best way to have each child thread to report the number
of requests they send when they are done? To add the numbers to
another queue?
3. I will need to do some logging of response times as well as some
response contents. I have two choices: one big log file for all
threads (both child and parent), or one log file per thread.
Given that I may have to log tons of data, I think opening
and maintaining a bunch of smaller logs may be better than dealing
with a big one (it may grow very fast). Is there any best practice
for logging in Python? If I change my mind and go with one big log
file (passed to each thread), is there anything I should be aware of
for multi-threaded access (writing) to the same log file?
Again, thank you.
The exception 'Empty' belongs to the module, not the class. Try
importing as:
from Queue import Queue, Empty
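To make the distinction concrete: `Empty` is an attribute of the module, so `Queue.Empty` fails once `Queue` names the class. A minimal sketch (using Python 3's lowercase `queue` module; the import line above is the Python 2 spelling):

```python
from queue import Queue, Empty  # Python 3; Python 2: from Queue import Queue, Empty

q = Queue()
try:
    item = q.get(True, 0.1)  # block for at most 0.1 seconds
except Empty:                # the module-level exception, not a class attribute
    item = None              # queue stayed empty within the timeout

print(item)  # -> None
```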
> 2. What's the best way to have each child thread to report the number
> of requests they send when they are done? To add the numbers to
> another queue?
>
Why not? :-)
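A results queue really is enough for this. A sketch, with the request counts hard-coded in place of real work (the `worker` body and counts are illustrative, not from the original code):

```python
import threading
from queue import Queue  # Python 2: the module is named Queue

result_q = Queue()

def worker(n_requests):
    # ... send n_requests requests here ...
    result_q.put(n_requests)   # report the count when done

counts = [7, 3, 5]
threads = [threading.Thread(target=worker, args=(c,)) for c in counts]
for t in threads:
    t.start()
for t in threads:
    t.join()                   # after join, every count is in the queue

total = sum(result_q.get() for _ in threads)
print(total)  # -> 15
```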
> 3. I will need to do some logging for response time as well as some
> response contents. I have two choices, one big log file for all
> threads (both child and parent), and one log file for each thread.
> Given the fact that I may have to log tons of data, I think opening
> and maintaining a bunch of smaller logs may be better than dealing
> with a big one (it may grow very fast). Is there any best prastice
> for logging in Python? If I change my mind and go with one big log
> file (pass it to each thread), is there anything I should be aware of
> for multi-thread access (writting) to the same log file?
>
> Again, thank you.
If you like threads then you could put the log items into a queue and
have another thread writing them to the logfile. :-)
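That dedicated writer thread keeps all file access in one place, so the workers never contend for the file. A minimal sketch, assuming a hypothetical log file name `app.log` and a `None` sentinel to stop the writer:

```python
import threading
from queue import Queue  # Python 2: the module is named Queue

log_q = Queue()

def log_writer(path):
    # Single writer thread: drains the queue into one file,
    # so worker threads never touch the file directly.
    with open(path, 'w') as f:
        while True:
            entry = log_q.get()
            if entry is None:      # sentinel: stop logging
                break
            f.write(entry + '\n')

t = threading.Thread(target=log_writer, args=('app.log',))
t.start()
log_q.put('worker 1: request took 120 ms')  # any thread may put entries
log_q.put(None)                             # tell the writer to finish
t.join()
```

The standard `logging` module is also thread-safe on its own; the queue approach mainly helps when the volume of entries is high.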
BTW, do you really need 250 threads? Seems like a lot.
I notice that you stop putting items into the queue when endTime is
reached and also the threads terminate when endTime is reached. If items
are put into the queue faster than they're taken out (or an item is put
in just before endTime) then there might still be unprocessed items in
the queue at endTime.
#from the logger thread#
def run(self):
    while self.flag == 1:  # if the flag is set to 0, the logger thread should exit
        try:
            entry = self.q.get()
        except Empty:
            self.logger.debug('cant find any log entry')
            continue
        except:
            self.logger.error("Unexpected error: %s", sys.exc_info()[0])
            raise
        #do whatever that should be done
    self.logger.info("logger thread done")  # should see this message in log file as well

def off(self):
    self.logger.info('turning off flag')
    self.flag = 0

#in parent thread#
logItemQ.put('We are done, lets stop the logger now.')
time.sleep(1)  # it seems that the logger thread cannot exit if I put a sleep here
myLog.off()  # off is called successfully
myLog.join()
I added an off method to turn off a flag so the logger thread knows it
should exit. However, the last log message (the 'We are done,
lets stop the logger now.' one) won't be logged if I call myLog.off() and
myLog.join() immediately. So I put in a time.sleep(1) to make sure the
logger thread has enough time to finish its job. Unfortunately, now
the logger thread simply won't exit, and I don't see the message
'logger thread done'. I can't figure out at which point it hangs,
since I don't see any new log entries but the thread simply won't exit.
Am I taking the right approach by using a flag? Should I lock the flag?
self.q.get() will block if the queue is empty, so the Empty exception is
never raised.
What's happening is that the parent thread puts the final message into
the queue, sleeps, and then clears the flag; meanwhile, the logging
thread gets the message, writes it out, checks the flag, which is still
set, and then tries to get the next message. The queue is empty, so the
.get() blocks.
The simplest solution is not to use a flag, but the sentinel trick. The
parent thread can put, say, None into the queue after the last message;
when the logging thread gets None, it knows it should terminate.
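The sentinel trick in miniature (Python 3 `queue` naming; a list stands in for the real log file):

```python
import threading
from queue import Queue  # Python 2: the module is named Queue

q = Queue()
lines = []

def logger():
    while True:
        msg = q.get()
        if msg is None:   # sentinel: no flag, no sleep, no race
            break
        lines.append(msg)

t = threading.Thread(target=logger)
t.start()
q.put('We are done, lets stop the logger now.')
q.put(None)               # sentinel goes in AFTER the last message
t.join()
print(lines)  # -> ['We are done, lets stop the logger now.']
```

Because the sentinel sits behind the last real message in the queue, the logger is guaranteed to process everything before it terminates, which is exactly what the flag-plus-sleep version could not guarantee.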
>M> self.q.get() will block if the queue is empty, so the Empty exception is
>M> never raised.
>M> What's happening is that the parent thread puts the final message into
>M> the queue, sleeps, and then clears the flag; meanwhile, the logging
>M> thread gets the message, writes it out, checks the flag, which is still
>M> set, and then tries to get the next message. The queue is empty, so the
>M> .get() blocks.
>M> The simplest solution is not to use a flag, but the sentinel trick. The
>M> parent thread can put, say, None into the queue after the last message;
>M> when the logging thread gets None, it knows it should terminate.
To the OP: a sleep will never do it because you can't be sure how long
the logging thread will lag behind. Most multithreaded `solutions' which
depend on timings are buggy.
If you have more than one thread writing to the log queue one sentinel
won't solve the problem. You would need as many sentinels as there are
threads, and the logger should count the sentinels which can have normal
messages between them. It may be a bit fragile to depend on the number
of threads especially when this number is dynamic. One solution would be
to have special StartLogging and StopLogging objects that are put in the
queue when a thread starts and finishes, respectively. The logger can
then stop when the number of StopLogging objects equals the number of
StartLogging objects received and that number is greater than 0. This
presupposes that no new thread will be started after all the others have
finished. The StartLogging objects can also be put in the queue by the
parent when it starts up the threads, which may be more robust because
it has fewer timing dependencies.
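A sketch of that counting scheme, with the parent announcing both producers up front (the marker objects and worker bodies are illustrative):

```python
import threading
from queue import Queue  # Python 2: the module is named Queue

START, STOP = object(), object()   # StartLogging / StopLogging markers
q = Queue()
received = []

def logger():
    started = stopped = 0
    while True:
        item = q.get()
        if item is START:
            started += 1
        elif item is STOP:
            stopped += 1
            if stopped == started:  # all announced producers have finished
                break
        else:
            received.append(item)   # a normal log message

def producer(name):
    q.put('%s: message' % name)
    q.put(STOP)                     # this producer is done

t = threading.Thread(target=logger)
t.start()
# Parent puts the START markers in BEFORE starting the producers,
# so the logger's count can never hit zero prematurely.
q.put(START)
q.put(START)
workers = [threading.Thread(target=producer, args=('w%d' % i,)) for i in (1, 2)]
for w in workers:
    w.start()
for w in workers:
    w.join()
t.join()
print(sorted(received))  # -> ['w1: message', 'w2: message']
```

Putting the START markers in from the parent, before any producer runs, is what removes the timing dependency: the stop condition can only be met after every announced producer has sent its STOP.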