
TCL threads with dependencies and TK bugging down main thread


Frank

Mar 6, 2014, 12:38:51 AM
Hi,

I am writing an application using threads that has the following situation.

Task 1
Step1 = A
Step2 = B
Step3 = C=A+B
Step4 = TK process of C


The above configuration can apply to several tasks. Each step is sent to the child threads, with the exception of Step 4, which runs in the main thread since it is Tk dependent and Tk is not multi-threaded.

As you can see, A and B are independent steps, C depends on the completion of A and B, and Step 4 is Tk dependent. Any task that has completed Step 3 will want to run Step 4 in the main thread, so I will potentially have to serialize Step 4 across the different tasks.

The way I currently have it coded, I am polling a variable (TSV) shared between all the threads. When the variable flags that a thread is available, I submit the next task.

The problem is that checking the variable bogs down my main thread, which in turn bogs down the computer and the servicing of the Step 4s that complete.

What is the best way to keep an eye on a thread variable and act on it asynchronously? For pipes we have fileevent.

If there is a better way to implement the situation above with threads, I would be very happy to hear it. :) If you have some code, even better. :)

Thanks in advance,
Frank

Christian Gollwitzer

Mar 6, 2014, 1:20:30 AM
On 06.03.14 06:38, Frank wrote:
> Hi,
>
> I am writing an application using threads that has the following situation.
>
> Task 1
> Step1 = A
> Step2 = B
> Step3 = C=A+B
> Step4 = TK process of C
>
> The way I currently have it coded I am pulling in a variable (TSV)
> shared between all the threads. When a thread is available flagged by
> the variable then I submit the next task.
>
> The problem is that checking on the variable bugs down my main thread
> and is bugging down the computer and servicing the Step4s that
> complete.

Are you doing busy waiting on the thread-shared variable? The way I
usually do it is by passing a message back to the main thread from the
computing threads, like this:

thread::send -async $main [list finished $result]

If you have a large amount of data to pass, then use a TSV instead of
passing the result directly in the message. IIRC thread::send stringifies
its arguments. I assume you could also put a trace on the TSV to see if
it has changed.
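Something along these lines (an untested sketch; the proc names `compute` and `finished` are just for illustration):

```tcl
# Untested sketch of the pattern: a worker computes, then notifies the
# main thread via thread::send -async, so nobody has to poll a TSV.
package require Thread

set main [thread::id]

set worker [thread::create {
    proc compute {main input} {
        set result [expr {$input * 2}]   ;# stand-in for the real work
        thread::send -async $main [list finished $result]
    }
    thread::wait
}]

proc finished {result} {
    puts "main thread got: $result"
    set ::done 1
}

thread::send -async $worker [list compute $main 21]
vwait ::done
```

The main thread just sits in its event loop (vwait here, the Tk event loop in a GUI) and `finished` is invoked for it.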

Christian

Frank

Mar 6, 2014, 10:04:06 AM
On Thursday, March 6, 2014 12:20:30 AM UTC-6, Christian Gollwitzer wrote:
> On 06.03.14 06:38, Frank wrote:
>
Hi Christian,

Thanks for the reply.

I think the main thread is getting bogged down when there are threads available but the conditions are not met to submit a new job, for example when Steps 1 and 2 are not complete for a particular task.

I believe that having a trace on the thread-completion variable could even interrupt the process of submitting a new step.

Somehow I need to match thread availability with the dependencies in order to execute Step 3 and Step 4. Maybe when Step 1 or Step 2 completes I could check whether its counterpart is also ready and set a step3_ready variable. But this has the same problems: the trace on the variable could interrupt a submission, and watching the variable bogs down the main thread.

Regards,
Frank

Rich

Mar 6, 2014, 12:51:23 PM
Frank <kra...@gmail.com> wrote:
: Hi,

: I am writing an application using threads that has the following
: situation.

: Task 1
: Step1 = A
: Step2 = B
: Step3 = C=A+B
: Step4 = TK process of C


: The above configuration can be applied to several tasks. Each step is
: sent to the child threads with the exception of Step4 that is run in
: the main thread as it is TK dependent and TK is not multi-threaded.

: As you can see A and B are independent steps, C depends on A and B
: completion and D is TK dependent. Any tasks that have completed Step3
: will want to run Step4 in the main Thread. Therefore potentially I
: will have to serialize the step4 for the different tasks.

: The way I currently have it coded I am pulling in a variable (TSV)
: shared between all the threads. When a thread is available flagged by
: the variable then I submit the next task.

: The problem is that checking on the variable bugs down my main thread
: and is bugging down the computer and servicing the Step4s that
: complete.

Threads in Tcl are different from threads in other languages in a few
subtle, but important, ways. One of those differences is that Tcl
threads run an event loop internally that allows using them in a
message passing manner.

You could structure your application like so (I'm assuming "step4" is
the master submitter of "jobs" as well, and I'm going to refer to your
"steps" as "threads"):

Thread 4 has a "job" containing an A and a B to process. Thread 4
picks a unique "job id" for this job, then submits, by message passing,
part A (with jobid) and part B (with jobid) to threads 1 and 2
respectively. If you use the -async flag to thread::send then this
means that Thread 4 will quickly send off the "jobs" and return to
handling the GUI.

Thread 1 performs step A, and via message passing (thread::send -async)
submits resultA+jobid to thread 3.

Thread 2 performs step B, and via message passing submits resultB+jobid
to thread 3.

Thread 3 receives incoming "results", and based upon jobid does one of
two things:
1) half a result set received - store result+jobid for later
2) both halves now received - perform C=A+B

When thread 3 has resultC, it submits, via message passing, resultC +
jobid (jobid may not be needed here, depending upon what you are
tracking in thread 4) to thread 4, deletes its temporary storage of
partial results, and returns to its event loop.

Thread 4 simply waits in its event loop for incoming results messages
from thread 3, and does whatever is appropriate when a message is
received.

With the above structure, no thread is ever busy waiting on a shared
variable. They are all just waiting in their respective event loops for
work to arrive, which means your thread 4 GUI will stay responsive, and
thread 4 won't waste significant CPU simply checking on a shared
variable.

Frank

Mar 6, 2014, 9:12:19 PM
Hi Rich,

Thanks for the reply. Your approach is very interesting. Making the threads a little smarter could definitely help.

I could have a TSV that keeps track of which thread has which step. When a thread finishes its step, it could lock, query, and update the TSV. Then it can send the data to the thread waiting for it, as you suggested.

The problem I see is when the number of tasks times steps significantly exceeds the number of threads available. The thread for Step 3, which is waiting for the data from Steps 1 and 2, may not be up or may not be processing that step when the data arrives.

I thought about pre-assigning the tasks to the different threads at the beginning, but the steps can finish out of order: for one task Steps 1, 2 and 3 could be fairly fast, while for other tasks they could consume a lot more time. I could therefore have a thread waiting for data from Steps 1 and 2 (which are going to take a long time to execute) while steps from other tasks that could execute very fast are not running because no threads are available.

Regards,
Frank

Frank

Mar 6, 2014, 9:12:43 PM

Rich

Mar 6, 2014, 9:47:37 PM
Frank <kra...@gmail.com> wrote:
> Hi Rich,

> I could have a TSV that could keep track of which thread has which
> step. When a thread finishes the step it could lock query and update
> the TSV. Then it can send the data to the thread waiting for it as
> you suggested.

You could, but you don't need to do so. Tcl's threads don't work like
C threads. They work much more like Smalltalk objects. They receive
messages and do something in response.

> The problem I see is when the number of tasks times steps exceed
> significantly the number of threads available. The thread for Step3
> which is waiting the data for steps 1 and 2 may not be up or
> processing that step to receive the data.

This is handled internally by Tcl and the Thread extension. The
message and data will simply be placed on the thread's event queue and
will wait there until the thread next pulls an event from it. I.e.,
event-queue message submission is always available, even if the thread
is otherwise busy.

> I thought about pre-assigning the tasks to the different threads at
> the beginning but because they can finish out of order for a
> particular task the steps 1, 2 and 3 could be fairly fast while for
> other tasks they could consume a lot more time. Therefore I could
> have a thread waiting for the data from steps 1 and 2 (which are
> going to take a long time to execute) while there are steps from
> other tasks that can execute very fast and are not running because
> there are no threads available.

Then consider running a thread pool (man tpool) instead. Set up a
thread pool and initialize X threads, each capable of performing all
three of A, B and C as operations. Submit your A's and B's into the
pool using the -detached and -nowait options. Have the A and B ops
message a "collector" thread running outside the pool when they are
done, to combine the proper A with B and resubmit a C operation into
the pool. Set up C to submit the answer to the Tk GUI thread.

Until you reach the point that you have a job running in every thread
you allocate, this will allow faster A's, B's, and C's to bypass slower
ones.

You can extend this metaphor to three pools, one each for A, B, and C
if you like.
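An untested sketch of this shape, with one simplification: here the collector computes C itself instead of resubmitting it to the pool, and the proc names (`step`, `collect`, `done`) are made up for illustration:

```tcl
# Untested sketch: tpool workers for A and B, a collector thread as the join.
package require Thread

set main [thread::id]

# Collector thread: pairs up the A and B halves of each job by jobid.
set collector [thread::create {
    proc collect {main jobid part value} {
        global buf
        set buf($jobid,$part) $value
        if {[info exists buf($jobid,A)] && [info exists buf($jobid,B)]} {
            # both halves present: C = A + B, report to the GUI/main thread
            set c [expr {$buf($jobid,A) + $buf($jobid,B)}]
            array unset buf $jobid,*
            thread::send -async $main [list done $jobid $c]
        }
    }
    thread::wait
}]

# Pool workers run one step and push the result to the collector.
set pool [tpool::create -minworkers 2 -initcmd {
    proc step {collector main jobid part input} {
        # stand-in for the real A or B computation
        thread::send -async $collector \
            [list collect $main $jobid $part [expr {$input * 10}]]
    }
}]

set pending 2
proc done {jobid c} {
    puts "job $jobid: C = $c"
    if {[incr ::pending -1] == 0} { set ::finished 1 }
}

foreach jobid {1 2} {
    tpool::post -detached -nowait $pool [list step $collector $main $jobid A $jobid]
    tpool::post -detached -nowait $pool [list step $collector $main $jobid B $jobid]
}
vwait ::finished
```

No thread ever blocks waiting for another; everything arrives through event queues.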

Frank

Mar 6, 2014, 10:59:25 PM
On Thursday, March 6, 2014 8:47:37 PM UTC-6, Rich wrote:
Hi Rich,

Thanks again.

As you can tell I am new to threads. :)

The idea of having a separate thread collecting the A and B parts and submitting a job to the queue once the parts are available is pretty cool.

I looked into tpool before but got stuck solving the A and B dependency, as tpool seems to work nicely when there are no dependencies. Having my main thread wait for messages from the A&B threads to submit my step_3 (C) did not solve the problem I was having, so I abandoned tpool. The approach of having another thread do this should work fine.

Do you have any pointers to examples using tpool? I did some searching and did not come up with something meaningful. I want to do some testing before I start ripping apart my application to make the changes using tpool. :) Hopefully I can code/modify an example similar to what I am describing here and be able to test it.

I believe I have to do a thread::wait in the thread monitoring the A&B task completion. Once that thread receives the thread::send -async from the other threads, it can check whether the associated task is done. If both tasks are done, submit to the tpool; if not, save the status and wait again with thread::wait.

Regards,
Frank

Christian Gollwitzer

Mar 7, 2014, 2:45:46 AM
On 07.03.14 04:59, Frank wrote:
> As you can tell I am new to threads. :)

New to threads in general, or new to threads in Tcl?

> The idea of having a separate thread collecting the A and B parts and
> submitting a job to the queue once the parts are available is pretty
> cool.
> [...]
> I believe I have to do a thread::wait on the thread monitoring the
> A&B tasks completion.

You should not explicitly call any function named "wait" if you can
avoid it. When you have a GUI that is supposed to control the tasks,
it should run in the main thread of your program, not doing any
significant work besides distributing the tasks to the other threads.
It will then sit in the event loop and never feel sluggish to the user.
The computing threads should signal completion of their tasks, with the
main thread redistributing the work.

It is not clear to me from the description you gave: do you have 1000
As and Bs which need to be processed, or is it just one very long A,
very long B, and long C task? Depending on that you would select a
different strategy.

For example, if there are many such tasks, you can simply run A, B, C
in sequence in one thread and start 1000 of these using the thread pool.
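That case can be sketched like this (untested; `abc` is an illustrative stand-in for the real steps, and tpool::wait is used here only because this is a plain script, not a GUI):

```tcl
# Untested sketch: many independent tasks, each pool job runs A, B, C
# in sequence; the pool itself provides the parallelism.
package require Thread

set pool [tpool::create -minworkers 4 -initcmd {
    proc abc {input} {
        set a [expr {$input + 1}]     ;# step A
        set b [expr {$input * 10}]    ;# step B
        expr {$a + $b}                ;# step C = A + B
    }
}]

set jobs {}
foreach input {1 2 3} {
    lappend jobs [tpool::post -nowait $pool [list abc $input]]
}

# tpool::wait returns the finished job ids and resets 'jobs' to the
# still-pending ones; tpool::get fetches each job's result.
while {[llength $jobs]} {
    foreach id [tpool::wait $pool $jobs jobs] {
        puts "result: [tpool::get $pool $id]"
    }
}
```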

In contrast, if there is only a single such sequence, you just start
three threads and manage the distribution by hand, like this:

# in thread A
proc A {inp} { work; thread::send -async $main [list finish A $result] }

# in thread B
proc B {inp} { work; thread::send -async $main [list finish B $result] }

# in thread C
proc C {Ares Bres} {
    work
    thread::send -async $main [list finish C $result]
}

# in GUI
proc doit {} {
    thread::send -async $A [list A $Ainp]
    thread::send -async $B [list B $Binp]
}

proc finish {task answer} {
    global results
    set results($task) $answer

    # if A and B are ready, schedule C
    if {[info exists results(A)] && [info exists results(B)]} {
        thread::send -async $C [list C $results(A) $results(B)]
        unset results(A) results(B)
        return
    }

    if {$task == "C"} {
        # finished; display
        .output insert end $results(C)
    }
}


Also, avoid thread-shared variables where they are not necessary. TSVs
are really more of an optimization for passing large amounts of data
between threads, and there are other means, too. E.g., if you read in a
big image and compute a single number from it, simply read the image
from the thread doing the computation and pass the number back with the
message. Another use for a TSV might be to signal a cancellation request
from the user, though this can be done with update and a simple
variable, too.
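When the bulk data really is large, the usual shape is to park it in a TSV slot and send only a small key in the message. An untested sketch (the names `results`, `produce` and `ready` are illustrative):

```tcl
# Untested sketch: bulk data travels through a TSV slot; the message
# itself carries only a small key.
package require Thread

set main [thread::id]

set worker [thread::create {
    proc produce {main} {
        set big [lrepeat 100000 x]            ;# stand-in for a large result
        tsv::set results job1 $big            ;# bulk payload into the TSV
        thread::send -async $main [list ready job1]  ;# message is just the key
    }
    thread::wait
}]

proc ready {key} {
    set data [tsv::get results $key]          ;# fetch the payload...
    tsv::unset results $key                   ;# ...and free the shared copy
    puts "received [llength $data] items for $key"
    set ::done 1
}

thread::send -async $worker [list produce $main]
vwait ::done
```

This avoids thread::send stringifying a multi-gigabyte argument while keeping the notification itself tiny.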

Christian

Rich

Mar 7, 2014, 8:24:16 AM
Frank <kra...@gmail.com> wrote:
> On Thursday, March 6, 2014 8:47:37 PM UTC-6, Rich wrote:
> Hi Rich,

> Thanks again.

> As you can tell I am new to threads. :)

> The idea of having a separate thread collecting the A and B parts and
> submitting a job to the queue once the parts are available is pretty cool.

That allows for creating a "join" point without the blocking wait
typical of a "join" point.

> I looked into tpool before but got stuck solving the A and B dependency as
> tpool seems to work nice when there are no dependencies.

It does work easiest when there are no interdependencies. But that is true
of any computation. The ones without interdependencies are easier to design
and code.

> Having my main thread waiting for messages from the A&B threads to submit
> my step_3(C) did not solved the problem I was having.

If by "waiting" you mean thread::wait, then that was the problem. Don't
think in terms of "C waits for A and B". Think in terms of "A pushes
answers to C", "B pushes answers to C", and "C combines A & B once both
arrive". I.e., think of how you'd work in an office if you were manually
doing A and then passing your answer along to Fred to perform C. Fred
would not sit there looming over you waiting for you to do A (well, not
if he wanted to get anything else done). Instead, when you finished an A
you'd walk your answer over to Fred's desk and put it into the inbox on
the corner. At some point after that, Fred would wake up from his nap,
notice the A from you, and do something with it.

> Therefore I abandoned looking into tpool. The approach of having another
> thread to do this should work fine.

The "collector" thread is what simulates the join point.

> Do you have any pointers to examples using tpool? I did some searching and
> did not come up with something meaningful.

Unfortunately, no, I do not. And there does seem to be a dearth of examples
on the web as well.

> I want to do some testing before I start ripping apart my application to
> make the changes using tpool. :) Hopefully I can code/modify an example
> similar to what I am describing here and be able to test it.

Always sage advice.

> I believe I have to do a thread::wait on the thread monitoring the A&B
> tasks completion.

No, you do not ever (if structured properly) need to do a thread::wait.

> Once the thread receives the thread::send -async from the other threads it
> can check if they associated task is done. If both tasks are done submit
> to tpool if not done same status and wait again with thread::wait.

The -async flag is why you don't ever need to touch thread::wait. The
sender, when using -async, behaves in a non-blocking way. It sends off the
result, and then returns to doing something else. The Tcl event loop
handles the idle/active state of the receiving thread.

Your "join" thread would contain a proc like this (note - untested code):

# this assumes "result_type" is the letter "A" or the letter "B"
proc join-results { identifier result_type result_value } {
    global results_buffer

    if { ! [dict exists $results_buffer $identifier] } {
        # store the partial result in results_buffer and return to the
        # event loop
        dict set results_buffer $identifier $result_type $result_value
    } else {
        # should have a pair of results now - but first a sanity check
        if { [dict exists $results_buffer $identifier $result_type] } {
            # received two A's or two B's for a single identifier - do some
            # error handling here
        }

        # store the newly received partial result in the dict - this is so
        # the two results can be ordered properly by simply retrieving them
        # both by name from the dict
        dict set results_buffer $identifier $result_type $result_value

        # this assumes "compute-C" takes two parameters, A and B, in that order
        set final_result [compute-C [dict get $results_buffer $identifier A] \
                                    [dict get $results_buffer $identifier B]]

        # send the final result to the GUI thread
        thread::send -async [tsv::get thread_ids gui-thread] \
            [list computation-complete $final_result]

        # clean up the results_buffer
        dict unset results_buffer $identifier
    }
}

Then the A and B threads/pools would submit to the collector like so (in a non-blocking manner):

thread::send -async [tsv::get thread_ids collector-thread] \
    [list join-results $result_identifier $result_type $result_value]

Frank

Mar 7, 2014, 12:27:49 PM
Hi Rich and Christian,

I'll try to code something and post it here. Hopefully it will help others with TCL thread examples.

I am having internet issues therefore it may take some time until I can upload it.

What I am planning to do is:

1) Create a number of threads with tpool, capable of executing tasks A, B and C.

2) Create a merger thread to merge tasks A and B and add task C to the tpool.

3) Generate a scheduler with many tasks A and B, each with a random wait amount, and add them to the tpool. The threads will wait that amount before completing the task, to model some randomness in thread completion times.

4) Have the threads start working on the tasks, have the threads with the A and B tasks send messages to the merger thread, and have the merger thread add task C to the tpool.

5) The thread that completes task C should send a message to the main thread indicating completion. If I can, I may make a simple text GUI, add the completions to it, and have the main thread take care of updating it.


A little more information on my real application: some of the tasks A and B can take 15-45 minutes to execute and work with 5+ GB of data. Task C can take 10 minutes and task D 1 minute or so. I could easily have 16 jobs that require tasks A, B, C and D. The transfer of data from the threads to the main thread could be a couple of GB, depending on the amount of data processed by the task. All the data transferring is working fine; the problem was the main thread getting bogged down by the scheduling. The approach above seems to solve this issue, as it moves the monitoring and scheduling of tasks to the child threads instead of the main thread.

I really appreciate the feedback and help provided. This is something that has been bugging me for a while; I tried several approaches but always ended up with the main thread overtaxed to no real purpose. The approach above looks very promising.

Regards,
Frank

Frank

Mar 9, 2014, 2:10:27 PM
Hi,

Below is what I have so far. It is not working. It seems the initial tasks A and B are getting executed, but I cannot get them to send a message to the main thread so it can update the GUI. The results TSV is not getting updated either, so the threads may be failing to execute properly.

I also set up a proc to catch thread errors and write them to a file, but I am not getting any errors in the file. I am not sure whether the tpool will also see the thread-error proc and use it if an error is generated.



package require Thread
package require opt
package require Tk

#
#--- Defining the thread error catch procedure and the thread error file.
#--- Thread error information will be written to the thread_errors.txt file.
#
proc threadError { thread_id errorInfo } {
    set errorFile [open thread_errors.txt a]
    puts $errorFile "Error in thread ID: $thread_id. \n\tError: $errorInfo\n"
    close $errorFile
}

#--- Create the child threads for the tpool

::tcl::OptProc run_step_a_or_b {
    {-task    -string "" "Task identifier."}
    {-step    -choice "step_a step_b" "Step identifier."}
    {-wait    -string "" "Wait period before finishing command in milliseconds."}
    {-combine -string "" "Thread that schedules task C."}
    {-main    -string "" "Main thread."}
} {
    thread::send -async $main "write_text \"Started execution of $task $step in thread [thread::id]\""

    after $wait

    #--- Transferring the data associated with step and task from the main thread.
    #--- Trivial for this example but in my app large amounts are passed between threads
    array set initial [tsv::array get initial "$task,$step"]

    #--- Execute the command sent
    set result($task,$step) [expr $initial($task,$step) * 10.0]

    #--- Transferring the results of processing step_a to the main thread.
    #--- Trivial for this example but in my app large amounts are passed between threads
    tsv::array set result [array get result]

    #--- Signaling the merger thread that execution of step_a for this task has finished
    thread::send -async $combine "collect_and_schedule -task $task -step $step"
}

::tcl::OptProc run_step_c {
    {-task -string "" "Task identifier."}
    {-wait -string "" "Wait period before finishing command in milliseconds."}
} {
    after $wait

    #--- Transferring the data associated with step and task from the main thread.
    #--- Trivial for this example but in my app large amounts are passed between threads
    array set result [tsv::array get result "$task,*"]

    #--- Execute the command sent
    set result($task,step_c) [expr $result($task,step_a) + $result($task,step_b)]

    #--- Transferring the results of processing step_a to the main thread.
    #--- Trivial for this example but in my app large amounts are passed between threads
    tsv::array set result [array get result]
}

#--------------------
#--- Main Program ---
#--------------------

#--- Create the GUI (main thread)
#set wctop "."
#toplevel $wctop
wm title . "Multi-thread test"
set log [text .log \
    -exportselection true \
    -height 20 \
    -borderwidth 4 \
    -relief sunken \
    -width 80 \
]

pack .log -fill both -expand 1 -side top

proc write_text {text} {
    global log
    $log insert end "$text\n"
    $log see end
    update
}

#--- Create the merger thread
#--- Defining the thread error catch procedure.
#--- If a thread error is detected this procedure is called and the error information is written to a file.
write_text "Main thread ID: [thread::id]"
thread::errorproc threadError

set collect_and_schedule_thread_id [thread::create {
    package require opt

    ::tcl::OptProc collect_and_schedule {
        {-task -string "" "Task identifier."}
        {-step -choice "step_a step_b" "Step identifier."}
    } {
        global status

        set status($task,$step) 1

        #--- Schedule step_c if both steps a and b were executed for this task
        if {[info exists status($task,step_a)] && [info exists status($task,step_b)]} {
            tpool::post -detached -nowait [tpool::names] "run_step_c -task $task -wait 1000"
        }
    }
    thread::wait
}]
write_text "Collector thread ID: ${collect_and_schedule_thread_id}"

#--- Initialize the input data
set num_tasks 10
for {set i 1} {$i <= $num_tasks} {incr i} {
    set initial(task_$i,step_a) $i
    set initial(task_$i,step_b) [expr $i * 10.0]
}

#--- Making the initial data available to the threads via a TSV
#--- Trivial in this example. My app transfers big amounts of data.

tsv::array set initial [array get initial]

write_text "Setup initial array data"

#--- Create the thread pool
set tpool_id [tpool::create -minworkers 5 -maxworkers 6]
write_text "Created tpool with tpool_id $tpool_id"

#--- Adding the tasks to the thread pool
set jobs ""
for {set i 1} {$i <= $num_tasks} {incr i} {
    lappend jobs [tpool::post -nowait $tpool_id "run_step_a_or_b -task task_$i -step step_a -wait 1000 -combine $collect_and_schedule_thread_id -main [thread::id]"]
    lappend jobs [tpool::post -nowait $tpool_id "run_step_a_or_b -task task_$i -step step_b -wait 1000 -combine $collect_and_schedule_thread_id -main [thread::id]"]
}

write_text "Added [llength $jobs] tasks to the tpool."
write_text "Setup done."
update

tpool::wait $tpool_id $jobs jobs_rem
write_text "Jobs remaining $jobs_rem"
while {[llength $jobs_rem] > 0} {
    tpool::wait $tpool_id $jobs jobs_rem
    write_text "In: $jobs_rem"
}

write_text "\n Initial: [tsv::array get initial]"

write_text "\n Results: [tsv::array get result]"



Rich

Mar 9, 2014, 2:31:21 PM
Frank <kra...@gmail.com> wrote:

> #--- Create the thread pool
> set tpool_id [tpool::create -minworkers 5 -maxworkers 6]

man n tpool:

tpool::create ?options?
    ...
    -initcmd script
        Sets a Tcl script used to initialize new worker thread.
        This is usually used to load packages and commands in the
        worker, set default variables, create namespaces, and such.
        If the passed script runs into a Tcl error, the worker will
        not be created and the initiating command (either the
        tpool::create or tpool::post) will throw error. Default
        value for this option is unspecified, hence, the Tcl
        interpreter of the worker thread will contain just the
        initial set of Tcl commands.

Without an init command you have a pool of threads with nothing defined
in them. You need to add something to the -initcmd that sets up each
thread appropriately:

package require Thread

set tp [tpool::create -minworkers 2 -maxworkers 4 -initcmd {
    proc i-am-in-the-thread-pool { param } {
        after [expr { int(rand() * 500) }]
        puts stderr "Hello from thread [thread::id] param='$param' [clock clicks]"
    }
}]

for {set i 0} {$i < 10} {incr i} {
    tpool::post -detached -nowait $tp [list i-am-in-the-thread-pool $i]
}

Result of pasting the above into a tclsh:
2.7.0
% % tpool0x242b7a0
% % % Hello from thread tid0x7f4340b4f700 param='1' 1394389798027979
Hello from thread tid0x7f4340b4f700 param='2' 1394389798115194
Hello from thread tid0x7f4340b4f700 param='3' 1394389798328501
Hello from thread tid0x7f433bfff700 param='0' 1394389798340290
Hello from thread tid0x7f4340b4f700 param='4' 1394389798364638
Hello from thread tid0x7f4340b4f700 param='6' 1394389798591951
Hello from thread tid0x7f433bfff700 param='5' 1394389798783891
Hello from thread tid0x7f4340b4f700 param='7' 1394389799082534
Hello from thread tid0x7f433bfff700 param='8' 1394389799129338
Hello from thread tid0x7f4340b4f700 param='9' 1394389799387932

Frank

Mar 10, 2014, 1:15:51 AM
Here is an update. Threads are executing but step_c is not getting submitted.


package require Thread
package require opt
package require Tk


#
#--- Defining the Thread error catch procedure and the thread error file.
#--- Thread error information will be written to the thread_errors.txt file.
#
proc threadError { thread_id errorInfo } {
set errorFile [open thread_errors.txt a]
puts $errorFile "Error in thread ID: $thread_id. \n\tError: $errorInfo\n"
close $errorFile
}



#--------------------
#--- Main Program ---
#--------------------

#--- Create the GUI (main thread)
#set wctop "."
#toplevel $wctop
wm title . "Multi-thread test"
set log [text .log \
-exportselection true \
-height 50 \
-borderwidth 4 \
-relief sunken \
-width 80
]

pack .log -fill both -expand 1 -side top

proc write_text {text} {global log; $log insert end "$text\n" ; $log see end ; update}

#--- Create the merger thread
#--- Defining the Thread error catch procedure.
#--- If a thread error is detected this procedure is called and the error information is written to a file.
write_text "Main thread ID: [thread::id]"
thread::errorproc threadError

set collect_and_schedule_thread_id [thread::create {
package require opt
package require Thread

::tcl::OptProc collect_and_schedule {
{-task -string "" "Task identifier."}
{-step -choice "step_a step_b" "Step identifier."}
{-main -string "" "Main thread."}
} {
global status
thread::send -async $main "write_text \"Started collect_and_schedule. Task = $task and step = $step.\""
set status($task,$step) 1
#--- Scehdule step_c if both steps a and b were executed for this task
set tid 0
if {[info exists status($task,step_a)] && [info exists status($task,step_b)]} {
thread::send -async $main "write_text \"Check Tpool. Tpool name=[tpool::names].\""
set tid [tpool::post -nowait [tpool::names] "run_step_c -task $task -wait 1000 -main $main"]
thread::send -async $main "write_text \"Finished merging $task $step in thread [thread::id]. Added task $tid to tpool [tpool::names].\""
}
thread::send -async $main "write_text \"Finished collect_and_schedule. Task = $task and step = $step.\""
}
thread::wait
} ]
write_text "Collector thread ID: ${collect_and_schedule_thread_id}"

#--- Initialize the input data
set num_tasks 10
for {set i 1} {$i <= $num_tasks} {incr i} {
set initial(task_$i,step_a) $i
set initial(task_$i,step_b) [expr $i * 10.0]
}

#--- Making the initial data available to the threads via a TSV
#--- Trivial in this example. My app transfers big amount of data.

tsv::array set initial [array get initial]

write_text "Setup initial array data"

#--- Create the thread pool
set tpool_id [tpool::create -minworkers 5 -maxworkers 6 -initcmd {
#--- Create the child threads for the tpool
package require opt

::tcl::OptProc run_step_a_or_b {
{-task -string "" "Task identifier."}
{-step -choice "step_a step_b" "Step identifier."}
{-wait -string "" "Wait period before finishing command in miliseconds."}
{-combine -string "" "Thread that schedules task C."}
{-main -string "" "Main thread."}
} {

thread::send -async $main "write_text \"Started execution of $task $step in thread [thread::id]\""
after $wait

#--- Transfering the data associated with step and task from the main thread.
#--- Trivial for this example but in my app large amounts are passed between threads
array set initial [tsv::array get initial "$task,$step"]

#--- Execute the command sent
set result($task,$step) [expr $initial($task,$step) * 10.0]

#--- Trasnfering the results of processing step_a to the main thread.
#--- Trivial for this example but in my app large amounts are passed between threads
tsv::array set result [array get result]

#--- Signaling the merger thread that execution of step_a for this task has finished
thread::send $combine "collect_and_schedule -task $task -step $step -main $main"

thread::send -async $main "write_text \"Finished execution of $task $step in thread [thread::id]\""
}

::tcl::OptProc run_step_c {
{-task -string "" "Task identifier."}
{-wait -string "" "Wait period before finishing command in miliseconds."}
{-main -string "" "Main thread."}
} {
thread::send -async $main "write_text \"Started execution of $task step_c in thread [thread::id]\""
after $wait

#--- Transfering the data associated with step and task from the main thread.
#--- Trivial for this example but in my app large amounts are passed between threads
array set result [tsv::array get result "$task,*"]

#--- Execute the command sent
set result($task,step_c) [expr $result($task,step_a) + $result($task,step_b)]

#--- Transferring the results of processing step_a to the main thread.
#--- Trivial for this example but in my app large amounts are passed between threads
tsv::array set result [array get result]
thread::send -async $main "write_text \"Finished execution of $task step_c in thread [thread::id]\""
}
}]
write_text "Created tpool with tpool_id $tpool_id"


#--- Adding the tasks to the thread pool
set jobs ""
for {set i 1} {$i <= $num_tasks} {incr i} {
lappend jobs [tpool::post -nowait $tpool_id "run_step_a_or_b -task task_$i -step step_a -wait 1000 -combine $collect_and_schedule_thread_id -main [thread::id]"]
lappend jobs [tpool::post -nowait $tpool_id "run_step_a_or_b -task task_$i -step step_b -wait 1000 -combine $collect_and_schedule_thread_id -main [thread::id]"]
}

write_text "Added [llength $jobs] tasks to the tpool."
write_text "Setup done."
update

tpool::wait $tpool_id $jobs jobs_rem
write_text "Jobs remaining $jobs_rem"
while {[llength $jobs_rem] > 0} {
tpool::wait $tpool_id $jobs jobs_rem
#write_text "In: $jobs_rem"

Rich

Mar 10, 2014, 8:40:10 AM
Frank <kra...@gmail.com> wrote:
> Here is an update. Threads are executing but step_c is not getting submitted.


> set collect_and_schedule_thread_id [thread::create {
> package require opt
> package require Thread
>
> ::tcl::OptProc collect_and_schedule {
> {-task -string "" "Task identifier."}
> {-step -choice "step_a step_b" "Step identifier."}
> {-main -string "" "Main thread."}
> } {
> global status
> thread::send -async $main "write_text \"Started collect_and_schedule. Task = $task and step = $step.\""
> set status($task,$step) 1
> #--- Schedule step_c if both steps a and b were executed for this task
> set tid 0
> if {[info exists status($task,step_a)] && [info exists status($task,step_b)]} {
> thread::send -async $main "write_text \"Check Tpool. Tpool name=[tpool::names].\""
> set tid [tpool::post -nowait [tpool::names] "run_step_c -task $task -wait 1000 -main $main"]
> thread::send -async $main "write_text \"Finished merging $task $step in thread [thread::id]. Added task $tid to tpool [tpool::names].\""
> }
> thread::send -async $main "write_text \"Finished collect_and_schedule. Task = $task and step = $step.\""
> }
> thread::wait
> } ]

One item I see off the bat is that your variable "$main" is not defined
here. Curly brackets {} in Tcl prevent all variable substitutions. So
your collect_and_schedule proc inside the collector thread is not
receiving the contents of your "$main" variable from the GUI thread,
but is instead attempting to use an undefined variable.

The simplest thing for you to do is this in the main GUI thread:

tsv::set threads main [ thread::id ]

Then in your collect_and_schedule thread, instead of "$main" do:
[ tsv::get threads main ]


That is one item I see right away. I have to go for now.
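A minimal sketch of the pattern described above (assuming the Thread package is available everywhere, and that `write_text` is the poster's GUI logging proc):

```tcl
package require Thread

# Main GUI thread: publish its thread id under a well-known key.
tsv::set threads main [thread::id]

# Any other thread can now look the id up; tsv storage is shared by
# all threads, so nothing has to be substituted into the braced
# creation script below.
set worker [thread::create {
    package require Thread
    proc report {msg} {
        thread::send -async [tsv::get threads main] [list write_text $msg]
    }
    thread::wait
}]
```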

Frank

Mar 10, 2014, 8:55:39 AM
Hi Rich,

I can make the changes using a TSV to clean the code.

The value of main comes in as a parameter from the invocation of the proc. The two messages below are working:


thread::send -async $main "write_text \"Started collect_and_schedule. Task = $task and step = $step.\""
thread::send -async $main "write_text \"Check Tpool. Tpool name=[tpool::names].\""

Here is where it seems to be getting stuck...

set tid [tpool::post -nowait [tpool::names] "run_step_c -task $task -wait 1000 -main $main"]

I never see the message below.

thread::send -async $main "write_text \"Finished merging $task $step in thread [thread::id]. Added task $tid to tpool [tpool::names].\""
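One way to check whether the tpool::post itself is raising an error, rather than hanging, is to wrap it in catch. A sketch using the same names as the snippet above:

```tcl
# Debugging wrapper: in the catch form, tid receives the error message
# on failure and the job id on success, so a failure is reported to
# the GUI instead of vanishing silently.
if {[catch {
    tpool::post -nowait [tpool::names] \
        [list run_step_c -task $task -wait 1000 -main $main]
} tid]} {
    thread::send -async $main [list write_text "tpool::post failed: $tid"]
} else {
    thread::send -async $main [list write_text "posted step_c as job $tid"]
}
```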


Regards,
Frank

Frank

Mar 10, 2014, 10:12:41 AM
Hi Rich,

I made the changes to clean the code.

What I found is that it goes through if I increase the number of threads, as seen in the line:
set tpool_id [tpool::create -minworkers 5 -maxworkers 50

But I am not sure why this is needed. Maybe some of the threads in the tpool are getting stuck (busy doing nothing), but based on the messages in the text widget they are leaving the procs successfully. Therefore I thought they would check the pool for the next task.

There is nothing in the threadErrors file.

The code below is working with the exception of the problem described above.



package require Thread
package require opt
tsv::set threads main [thread::id]
thread::errorproc threadError

set collect_and_schedule_thread_id [thread::create {
package require opt
package require Thread

::tcl::OptProc collect_and_schedule {
{-task -string "" "Task identifier."}
{-step -choice "step_a step_b" "Step identifier."}
} {
global status
thread::send -async [tsv::get threads main] "write_text \"Started collect_and_schedule. Task = $task and step = $step.\""
set status($task,$step) 1
#--- Schedule step_c if both steps a and b were executed for this task
set tid 0
if {[info exists status($task,step_a)] && [info exists status($task,step_b)]} {
thread::send -async [tsv::get threads main] "write_text \"Check Tpool. Tpool name=[tpool::names].\""
set tid [tpool::post -nowait [tpool::names] "run_step_c -task $task -wait 1000"]
thread::send -async [tsv::get threads main] "write_text \"Finished merging $task $step in thread [thread::id]. Added task $tid to tpool [tpool::names].\""
}
thread::send -async [tsv::get threads main] "write_text \"Finished collect_and_schedule. Task = $task and step = $step.\""
}
thread::wait
} ]
write_text "Collector thread ID: ${collect_and_schedule_thread_id}"

#--- Initialize the input data
set num_tasks 10
for {set i 1} {$i <= $num_tasks} {incr i} {
set initial(task_$i,step_a) $i
set initial(task_$i,step_b) [expr {$i * 10.0}]
}

#--- Making the initial data available to the threads via a TSV
#--- Trivial in this example. My app transfers big amounts of data.

tsv::array set initial [array get initial]

write_text "Setup initial array data"

#--- Create the thread pool
set tpool_id [tpool::create -minworkers 5 -maxworkers 50 -initcmd {
#--- Create the child threads for the tpool
package require opt

::tcl::OptProc run_step_a_or_b {
{-task -string "" "Task identifier."}
{-step -choice "step_a step_b" "Step identifier."}
{-wait -string "" "Wait period before finishing command in milliseconds."}
{-combine -string "" "Thread that schedules task C."}
} {

thread::send -async [tsv::get threads main] "write_text \"Started execution of $task $step in thread [thread::id]\""
after $wait

#--- Transferring the data associated with step and task from the main thread.
#--- Trivial for this example but in my app large amounts are passed between threads
array set initial [tsv::array get initial "$task,$step"]

#--- Execute the command sent
set result($task,$step) [expr {$initial($task,$step) * 10.0}]

#--- Transferring the results of processing step_a to the main thread.
#--- Trivial for this example but in my app large amounts are passed between threads
tsv::array set result [array get result]

#--- Signaling the merger thread that execution of step_a for this task has finished
thread::send $combine "collect_and_schedule -task $task -step $step"

thread::send -async [tsv::get threads main] "write_text \"Finished execution of $task $step in thread [thread::id]\""
}

::tcl::OptProc run_step_c {
{-task -string "" "Task identifier."}
{-wait -string "" "Wait period before finishing command in milliseconds."}
} {
thread::send -async [tsv::get threads main] "write_text \"Started execution of $task step_c in thread [thread::id]\""
after $wait

#--- Transferring the data associated with step and task from the main thread.
#--- Trivial for this example but in my app large amounts are passed between threads
array set result [tsv::array get result "$task,*"]

#--- Execute the command sent
set result($task,step_c) [expr {$result($task,step_a) + $result($task,step_b)}]

#--- Transferring the results of processing step_a to the main thread.
#--- Trivial for this example but in my app large amounts are passed between threads
tsv::array set result [array get result]
thread::send -async [tsv::get threads main] "write_text \"Finished execution of $task step_c in thread [thread::id]\""
}
}]
write_text "Created tpool with tpool_id $tpool_id"


#--- Adding the tasks to the thread pool
set jobs ""
for {set i 1} {$i <= $num_tasks} {incr i} {
lappend jobs [tpool::post -nowait [tpool::names] "run_step_a_or_b -task task_$i -step step_a -wait 1000 -combine $collect_and_schedule_thread_id"]
lappend jobs [tpool::post -nowait [tpool::names] "run_step_a_or_b -task task_$i -step step_b -wait 1000 -combine $collect_and_schedule_thread_id"]

Rich

Mar 10, 2014, 4:53:00 PM
Frank <kra...@gmail.com> wrote:
> Hi Rich,

> I made the changes to clean the code.

> What I found is that if I increase the number of threads it goes
> through. As seen in the line: set tpool_id [tpool::create -minworkers
> 5 -maxworkers 50

> But I am not sure why this is needed. Maybe some of the threads in
> the tpool are getting stuck (busy doing nothing) but based on the
> messages in the text widget they are leaving the procs successfully.
> Therefore I thought they will check in the pool for the next task.

Under Linux, with anything smaller than -maxworkers 10 it appears to
enter an infinite loop.

Why it is doing this I do not yet understand.

Frank

Mar 10, 2014, 11:12:02 PM
Hi Rich,

Thanks for giving it a shot.

On Windows it is the same: with maxworkers of 11 it runs fine; with maxworkers of 10 it "hangs".

I looked at the Windows performance monitor and the CPU was not cranking. The Tk GUI was also responsive. It almost looked like the threads had stopped working and were no longer looking at the tpool for jobs. I added an -exitcmd to the tpool::create to write a message if an idle worker thread exits, and I also played with the -idletime parameter. I did not get any messages about idle workers exiting.

I am confused, as it seems pretty straightforward. With tpool it is hard to get a list of the "pending" jobs; it almost seems the user has to keep track of them every time they are submitted.

I am also not getting anything from the thread::errorproc that I set up. It works fine with the thread::create workers, but it does not work with the tpool::create workers.

Thanks again,
Frank


Frank

Mar 11, 2014, 12:25:12 AM
Hi Rich,

The code below is now working.

I added the -async to the line below

thread::send -async $combine "collect_and_schedule -task $task -step $step"

It seems that this freed the worker. I believe it may be the thread::wait at the end of the collector that never told the worker the job was done, and therefore the worker was not freed. If I remove the thread::wait from the collector, it finishes right away and never gets the requests from the child workers.
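For anyone following along, the difference matters because a plain thread::send blocks the sender until the target thread has fully processed the script, while -async queues the script and returns at once. A minimal sketch of the two modes (assuming the Thread package):

```tcl
package require Thread

set t [thread::create {
    proc slow {} {after 2000; return done}
    thread::wait
}]

# Synchronous: the caller is parked here until the target thread has
# run the script -- a tpool worker stuck here cannot take its next job.
set r [thread::send $t slow]

# Asynchronous: queues the script and returns immediately; the result
# is discarded unless a result variable name is supplied.
thread::send -async $t slow
```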

As you can see I am running 15 tasks with minworkers 3 and maxworkers 6.


Thanks again for your help and recommendations.

I believe I am now ready to start incorporating this into my app. :)

Regards,
Frank
#thread::send -async [tsv::get threads main] "write_text \"Check Tpool. Tpool name=[tpool::names].\""
set tid [tpool::post -nowait [tpool::names] "run_step_c -task $task -wait 1000"]
thread::send -async [tsv::get threads main] "write_text \"Finished merging $task $step in thread [thread::id]. Added task $tid to tpool [tpool::names].\""
}
thread::send -async [tsv::get threads main] "write_text \"Finished collect_and_schedule. Task = $task and step = $step.\""
}
thread::wait
} ]
write_text "Collector thread ID: ${collect_and_schedule_thread_id}"

#--- Initialize the input data
set num_tasks 15
for {set i 1} {$i <= $num_tasks} {incr i} {
set initial(task_$i,step_a) $i
set initial(task_$i,step_b) [expr {$i * 10.0}]
}

#--- Making the initial data available to the threads via a TSV
#--- Trivial in this example. My app transfers big amounts of data.

tsv::array set initial [array get initial]

write_text "Setup initial array data"

#--- Create the thread pool
set tpool_id [tpool::create -minworkers 3 -maxworkers 6 -idletime 3600 -exitcmd finish -initcmd {
#--- Create the child threads for the tpool
package require opt

::tcl::OptProc run_step_a_or_b {
{-task -string "" "Task identifier."}
{-step -choice "step_a step_b" "Step identifier."}
{-wait -string "" "Wait period before finishing command in milliseconds."}
{-combine -string "" "Thread that schedules task C."}
} {

thread::send -async [tsv::get threads main] "write_text \"Started execution of $task $step in thread [thread::id]\""
after $wait

#--- Transferring the data associated with step and task from the main thread.
#--- Trivial for this example but in my app large amounts are passed between threads
array set initial [tsv::array get initial "$task,$step"]

#--- Execute the command sent
set result($task,$step) [expr {$initial($task,$step) * 10.0}]

#--- Transferring the results of processing step_a to the main thread.
#--- Trivial for this example but in my app large amounts are passed between threads
tsv::array set result [array get result]

#--- Signaling the merger thread that execution of step_a for this task has finished
thread::send -async $combine "collect_and_schedule -task $task -step $step"

thread::send -async [tsv::get threads main] "write_text \"Finished execution of $task $step in thread [thread::id]\""
return
}

::tcl::OptProc run_step_c {
{-task -string "" "Task identifier."}
{-wait -string "" "Wait period before finishing command in milliseconds."}
} {
thread::send -async [tsv::get threads main] "write_text \"Started execution of $task step_c in thread [thread::id]\""
after $wait

#--- Transferring the data associated with step and task from the main thread.
#--- Trivial for this example but in my app large amounts are passed between threads
array set result [tsv::array get result "$task,*"]

#--- Execute the command sent
set result($task,step_c) [expr {$result($task,step_a) + $result($task,step_b)}]

#--- Transferring the results of processing step_a to the main thread.
#--- Trivial for this example but in my app large amounts are passed between threads
tsv::array set result [array get result]
thread::send -async [tsv::get threads main] "write_text \"Finished execution of $task step_c in thread [thread::id]\""
return
}

proc finish {} {thread::send -async [tsv::get threads main] "write_text \"Thread exiting [thread::id]\""}
}]
write_text "Created tpool with tpool_id $tpool_id"


#--- Adding the tasks to the thread pool
set jobs ""
for {set i 1} {$i <= $num_tasks} {incr i} {
lappend jobs [tpool::post -nowait [tpool::names] "run_step_a_or_b -task task_$i -step step_a -wait 1000 -combine $collect_and_schedule_thread_id"]
lappend jobs [tpool::post -nowait [tpool::names] "run_step_a_or_b -task task_$i -step step_b -wait 1000 -combine $collect_and_schedule_thread_id"]
}

write_text "Added [llength $jobs] tasks to the tpool."
write_text "Setup done."
update

tpool::wait $tpool_id $jobs jobs_rem
write_text "Jobs remaining $jobs_rem"
while {[llength $jobs_rem] > 0} {
tpool::wait $tpool_id $jobs jobs_rem
#write_text "\n\nIn: $jobs_rem\n\n"