
issues getting results from parallel threads (thread::send -async)


Josef Frank

27 Jun 2015, 15:04:24

When I try to get results from scripts sent to parallel threads
asynchronously, I have trouble retrieving the actual results (I'm just
taking my first steps in multithreading):

I am trying to send text lines to different threads for parallel
processing. The main thread should eventually output the lines in their
original order.

I'm sending the scripts to the threads with the "-async" flag. The
problem is getting the results back when the 1st of two lines needs more
processing time than the 2nd:

thread::send -async $handle_thread1 {heavy_work} result_thread1
thread::send -async $handle_thread2 {less_heavy_work} result_thread2

then if I only do

vwait result_thread2
puts $result_thread1
puts $result_thread2

it can happen that result_thread1 is still not set at this point,
because thread1 needs more time than thread2.

OTOH, if I do

vwait result_thread1
vwait result_thread2
puts $result_thread1
puts $result_thread2

the program gets locked at the 2nd vwait if result_thread2 is set before
result_thread1. How would I best solve this?
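
The hang has a simple cause: vwait only returns on a write that happens while it is waiting. If result_thread2 is delivered while the script is still inside the first vwait, the second vwait waits for another write that never comes. A minimal sketch of one way around this, assuming the result variables are unset before the jobs are sent, is to wait only for variables that do not exist yet. The `after`/`string toupper` scripts are hypothetical stand-ins for heavy_work and less_heavy_work:

```tcl
package require Thread

# Two worker threads; without a script argument each one simply waits
# in its event loop for scripts to be sent to it.
set t1 [thread::create]
set t2 [thread::create]

# The result variables must not exist before the jobs are started.
unset -nocomplain ::result_thread1 ::result_thread2

# Hypothetical workloads: the first job takes longer than the second.
thread::send -async $t1 {after 300; string toupper "first line"}  ::result_thread1
thread::send -async $t2 {after 50;  string toupper "second line"} ::result_thread2

# Wait for a result only if it has not already arrived.
foreach var {::result_thread1 ::result_thread2} {
    while {![info exists $var]} {
        vwait $var
    }
}

puts $::result_thread1
puts $::result_thread2

thread::release $t1
thread::release $t2
```

This keeps the optional variable argument of thread::send and works regardless of which thread finishes first.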

Best
Josef


Gerald W. Lester

27 Jun 2015, 16:53:33
Read the section of the thread man/help page on "thread::cond"
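
A rough sketch of what a condition-variable approach might look like, not taken from the thread: each worker stores its result in a thread-shared variable (tsv), bumps a counter and notifies the condition variable, while the main thread waits until the counter reaches the number of jobs. The helper `start_job`, the tsv names `shared` and `results`, and the simulated workloads are all assumptions made for illustration:

```tcl
package require Thread

set mutex [thread::mutex create]
set cond  [thread::cond create]
tsv::set shared done 0

# Hypothetical helper: start a worker that stores its result under $slot.
proc start_job {slot delay_ms text mutex cond} {
    set tid [thread::create]
    thread::send -async $tid [list apply {{slot delay_ms text mutex cond} {
        after $delay_ms                      ;# simulated work
        set res [string toupper $text]
        thread::mutex lock $mutex
        tsv::set results $slot $res          ;# publish the result
        tsv::incr shared done                ;# count finished jobs
        thread::cond notify $cond            ;# wake up the waiting main thread
        thread::mutex unlock $mutex
    }} $slot $delay_ms $text $mutex $cond]
    return $tid
}

set workers [list \
    [start_job 1 300 "first line"  $mutex $cond] \
    [start_job 2 50  "second line" $mutex $cond]]

# Standard condition-variable loop: re-check the predicate after every wakeup.
thread::mutex lock $mutex
while {[tsv::get shared done] < 2} {
    thread::cond wait $cond $mutex
}
thread::mutex unlock $mutex

# Results are read by slot, so output order does not depend on completion order.
puts [tsv::get results 1]
puts [tsv::get results 2]

foreach tid $workers { thread::release $tid }
```

Note that thread::cond wait blocks the main thread without running its event loop, which is fine here because the results travel through tsv rather than through events.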


--
+------------------------------------------------------------------------+
| Gerald W. Lester, President, KNG Consulting LLC |
| Email: Gerald...@kng-consulting.net |
+------------------------------------------------------------------------+

Alexandre Ferrieux

28 Jun 2015, 13:15:34
On Saturday, June 27, 2015 at 9:04:24 PM UTC+2, Josef Frank wrote:
>
> thread::send -async $handle_thread1 {heavy_work} result_thread1
> thread::send -async $handle_thread2 {less_heavy_work} result_thread2
>
> vwait result_thread2
> puts $result_thread1
> puts $result_thread2
>
> [dependent on completion order]

This highlights the fact that using the optional variable argument in [thread::send] is only fit for the simplest cases.
To work around this, don't use the variable argument, but arrange for both of your async threads to [thread::send] back to the main thread something that will append its result to a common variable:

# Note: $main is evaluated inside each worker, so it must be set there to the main thread's id
thread::send -async $handle_thread1 {set res [heavy_work]; thread::send $main [list lappend ::result TH1 $res]}

thread::send -async $handle_thread2 {set res [less_heavy_work]; thread::send $main [list lappend ::result TH2 $res]}

while {1} {
    vwait ::result
    if {[llength $::result] >= 4} break
}
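
For reference, a self-contained sketch of this idea that can be run as is. It makes a few assumptions beyond the post: the main thread's id is passed into the worker script explicitly, the send back to the main thread is also `-async`, and the hypothetical helper `submit` with its simulated workload stands in for heavy_work and less_heavy_work:

```tcl
package require Thread

set main [thread::id]
set ::result {}

# Hypothetical helper: run a simulated job in thread $tid and have it
# append "tag result" to ::result in the main thread when done.
proc submit {tid tag delay_ms text main} {
    thread::send -async $tid [list apply {{tag delay_ms text main} {
        after $delay_ms                      ;# simulated work
        set res [string toupper $text]
        thread::send -async $main [list lappend ::result $tag $res]
    }} $tag $delay_ms $text $main]
}

set t1 [thread::create]
set t2 [thread::create]
submit $t1 TH1 300 "first line"  $main
submit $t2 TH2 50  "second line" $main

# Each vwait returns after one lappend; loop until both tag/result pairs are in.
while {[llength $::result] < 4} {
    vwait ::result
}

# ::result is in completion order; look results up by tag for the original order.
puts [dict get $::result TH1]
puts [dict get $::result TH2]

thread::release $t1
thread::release $t2
```

Because every result arrives as a separate write to the same variable, a single vwait loop handles any completion order.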

-Alex

Josef Frank

29 Jun 2015, 19:31:07

You made my day!

Thanks to both of you.

Fiddling around with mutexes on my own had not produced the desired
result, but using condition variables did.

> This highlights the fact that using the optional variable argument in [thread::send] is only fit for the simplest cases.
> To work around this, don't use the variable argument, but arrange for both of your async threads to [thread::send] back to the main thread something that will append its result to a common variable:
>
> thread::send -async $handle_thread1 {set res [heavy_work];thread::send $main [list lappend ::result TH1 $res]}
>
> thread::send -async $handle_thread2 {set res [less_heavy_work];thread::send $main [list lappend ::result TH2 $res]}

After some trials processing the lines in parallel in fixed-size
batches, I realized that I was wasting time waiting for each batch to
complete.

So I finally followed Alex's suggestion, in a way: I created an
additional output thread to which the processed lines are sent directly
as they become available.

Amazingly, the resulting pipeline (reader-filter-writer) outperforms the
single-threaded serial solution (reading, processing and writing back
line by line) hands down, even if only one filter thread is running;
with two or three filters in parallel it saves about 40% of the
original time. The batch-oriented version, in contrast, needs at least
two parallel filters to gain any speed at all, as it otherwise just
introduces additional overhead that results in an actual slowdown.
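
The actual code is not shown in this post, so the following is only a sketch of how such a reader-filter-writer pipeline could look. It assumes each line carries a sequence number so that the writer thread can restore the original order; the procs `work` and `emit`, the round-robin dispatch, and the `string toupper` filter step are hypothetical:

```tcl
package require Thread

# Writer thread: buffers out-of-order results and emits them in sequence.
set writer [thread::create {
    set next 0
    set out {}
    array set pending {}
    proc emit {seq line} {
        global next out pending
        set pending($seq) $line
        while {[info exists pending($next)]} {
            lappend out $pending($next)      ;# a real writer would puts to a channel
            unset pending($next)
            incr next
        }
    }
    thread::wait
}]

# Filter threads: process a line and pass it straight on to the writer.
set nfilters 2
set filters {}
for {set i 0} {$i < $nfilters} {incr i} {
    lappend filters [thread::create [string map [list %WRITER% $writer] {
        proc work {seq line} {
            set res [string toupper $line]   ;# hypothetical filter step
            thread::send -async %WRITER% [list emit $seq $res]
        }
        thread::wait
    }]]
}

# Reader (main thread): hand out lines round-robin, tagged with a sequence number.
set seq 0
foreach line {alpha beta gamma delta epsilon} {
    set tid [lindex $filters [expr {$seq % $nfilters}]]
    thread::send -async $tid [list work $seq $line]
    incr seq
}

# Drain: a synchronous send returns only after everything queued before it ran.
foreach tid $filters { thread::send $tid {} }
set result [thread::send $writer {set out}]
puts [join $result \n]

foreach tid [concat $filters [list $writer]] { thread::release $tid }
```

The reader never waits for a batch to finish; it only synchronizes once at the end, which matches the observation that waiting on batches wastes time.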

Josef
