Threads: what is the most efficient way to send a largish amount of data to another thread?

Arjen Markus

Feb 7, 2011, 4:28:04 AM

I was experimenting a bit with the threads package and came to wonder
about the above. As I send scripts containing data from one thread to
another, I began to wonder if that is the most efficient way.

The idea is: two or more threads run a computation, and when that is
finished they have to exchange results. Only when all the data have
been exchanged can they continue with the next step. That amount of
data might be large.

Can anyone advise on the best way to do this? I now use ::thread::send
and that works fine, but in this experiment I just pass a single value.

Regards,

Arjen

Donal K. Fellows

Feb 7, 2011, 4:35:37 AM

On Feb 7, 9:28 am, Arjen Markus <arjen.markus...@gmail.com> wrote:
> Two or more threads run a computation and when that is finished they
> have to exchange results. Only when all the data have been exchanged,
> they can continue with the next step. That amount of data might be
> large.
>
> Can anyone advise on the best way to do this. I now use ::thread::send
> and that works fine, but in this experiment I just pass a single
> value.

You might want to use the shared variable stuff, and just pass the
name of the shared variable in your messages. (I don't know how
efficient this is for large data in comparison to direct messaging;
measure it...)
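
A minimal sketch of this suggestion, assuming the Thread package's tsv commands. The array name "results", the key "block1", and the proc "consume" are made up for illustration. Only the name of the shared variable travels in the message; the data itself sits in the shared store.

```tcl
package require Thread

# Worker thread: it is told only the NAME of the shared variable.
set worker [thread::create {
    proc consume {array key} {
        # tsv::get copies the value out of the shared store into this thread
        set data [tsv::get $array $key]
        return [llength $data]
    }
    thread::wait
}]

# Hypothetical payload: a list of 10,000 numbers
set values {}
for {set i 0} {$i < 10000} {incr i} {
    lappend values [expr {$i * 0.5}]
}

# Publish the data, then send a short message carrying only its name
tsv::set results block1 $values
set count [thread::send $worker [list consume results block1]]
puts "worker saw $count elements"

# Clean up the shared variable and let the worker exit
tsv::unset results block1
thread::release $worker
```

Whether this beats sending the data directly still has to be measured, since the value is copied into and out of the shared store.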

Donal.

Uwe Klein

Feb 7, 2011, 5:06:21 AM

Arjen Markus wrote:
> I was experimenting a bit with the threads package and I came to
> wonder about the above.
> As I send scripts containing data from one thread to another, I began
> to wonder if that is the
> most efficient way.
>
> The idea is:
> Two or more threads run a computation and when that is finished they
> have to exchange results.
> Only when all the data have been exchanged, they can continue with the
> next step. That amount
> of data might be large.

I wrote a largish distributed instrument control and data collection/processing
suite once (before I ever knew about expect and Tcl).
Communication was via pipes and a message multiplexer, with a neanderthal expect
for processing control.
Data sharing was via SHM files, with a bunch of processes working on the data.
This worked quite well and could be easily extended.

i.e.
Beam data into SHMA, report ready over pipe.
Independently beam data into SHMB, report ready over pipe.
Wait for A and B ready, kick off processing of SHMA/B data and store into SHMC.
Kick the recording process to store another block from SHMC.

uwe

Alexandre Ferrieux

Feb 7, 2011, 5:47:09 AM

On Feb 7, 10:35, "Donal K. Fellows" <donal.k.fell...@manchester.ac.uk>
wrote:

Anything wrong with [::thread::send $tid [list foo $value]] ? Doesn't
this respect the internal reps, and simply hand the Tcl_Obj's over to
the dest thread+interp ?

-Alex

Arjen Markus

Feb 7, 2011, 7:21:29 AM

On Feb 7, 11:47, Alexandre Ferrieux <alexandre.ferri...@gmail.com>
wrote:

> -Alex

I simply don't know - I do not know if it will expand the script into
a string and then interpret that again. It would boil down I guess to:

::thread::send $tid [list set data $list_of_ten_thousand_values]

versus

::thread::send $tid "set data {$list_of_ten_thousand_values}"

I guess measuring is what is called for ;)
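
A rough way to run that measurement, assuming a threaded Tcl 8.5 or later with the Thread package. The 10,000-element list and the 20 repetitions are arbitrary choices for illustration, and the timings will depend on the machine and the Tcl build.

```tcl
package require Thread

# A worker that just sits in its event loop
set tid [thread::create {thread::wait}]

# Hypothetical payload: a list of 10,000 values
set list_of_ten_thousand_values [lrepeat 10000 3.14]

# Variant 1: the script is built as a proper list
puts "list form:   [time {
    thread::send $tid [list set data $list_of_ten_thousand_values]
} 20]"

# Variant 2: the script is built by string interpolation
puts "string form: [time {
    thread::send $tid "set data {$list_of_ten_thousand_values}"
} 20]"

# Check that the data really arrived in the other thread
set received [thread::send $tid {llength $data}]
puts "worker holds $received elements"

thread::release $tid
```

Note that the string form only works here because the values contain no unbalanced braces; the list form is the safe one in general.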

Regards,

Arjen

Alexandre Ferrieux

Feb 7, 2011, 3:00:28 PM

Indeed. I finally got my hands on a threaded Tcl, and found that
thread::send does force a string-only path:

% thread::create
tid0xb6c05b70
% thread::send tid0xb6c05b70 {proc r s {puts [::tcl::unsupported::representation $s]}}
% set x [list a z e r];puts [::tcl::unsupported::representation $x];thread::send tid0xb6c05b70 [list r $x[unset x]]
value is a list with a refcount of 2, object pointer at 0x863a5e0, internal representation 0x8663e98:(nil), no string representation.
value is a pure string with a refcount of 3, object pointer at 0xb6110928, string representation "a z e r".

Note that this test also shows that even unshared values (K trick) get
pure-stringified (and duplicated).
So it looks like it's not a good idea to use thread::send for big
values ;)

-Alex

Alexandre Ferrieux

Feb 7, 2011, 3:24:20 PM

On Feb 7, 9:00 pm, Alexandre Ferrieux <alexandre.ferri...@gmail.com> wrote:

Note that, in hindsight, this pure string path is a necessary evil in
the context of the apartment model: no value should be shared between
threads. It's just that, as a special case, a deeply-unshared object
*could* be passed more efficiently. Low prio...

-Alex

blacksqr

Feb 8, 2011, 1:22:01 AM

I recently read that SQLite uses shared memory to allow multiple
processes to access a single database simultaneously without
duplication of RAM resources. Sorry I don't have a lot of detailed
technical insight, but it might be possible for each of your threads
to open a connection to a SQLite database and exchange data via reads
and writes to it.

--SEH

Arjen Markus

Feb 8, 2011, 6:41:01 AM

On Feb 7, 21:24, Alexandre Ferrieux <alexandre.ferri...@gmail.com> wrote:

Then using the tsv facility might solve such (anticipated, not
observed) performance issues.
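
For the exchange-then-continue scenario from the start of the thread, a sketch along these lines might do, assuming the Thread package on Tcl 8.5 or later. The array name "exchange", the proc "done", and the toy computation are invented for illustration. Each worker stores its result in a tsv array and sends the main thread a short asynchronous notification; the main thread continues only once every worker has reported.

```tcl
package require Thread

set main [thread::id]
set names {a b}
set pending [llength $names]

# Runs in the main thread each time a worker reports that it is finished
proc done {name} {
    global pending
    incr pending -1
}

set workers {}
foreach name $names {
    set tid [thread::create {thread::wait}]
    lappend workers $tid
    # Ask the worker to compute, publish its result, and notify the main thread
    thread::send -async $tid [list apply {{main name} {
        set result [string repeat $name 5]   ;# stand-in for the real computation
        tsv::set exchange $name $result
        thread::send -async $main [list done $name]
    }} $main $name]
}

# Service events until all workers have reported
while {$pending > 0} {
    vwait pending
}

# All results are now available to every thread through the shared array
foreach name $names {
    puts "$name -> [tsv::get exchange $name]"
}

foreach tid $workers {
    thread::release $tid
}
```

Only the short notification goes through thread::send; the results themselves are read from the shared array by whoever needs them.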

Regards,

Arjen

Arjen Markus

Feb 8, 2011, 6:43:26 AM

Yet another way to solve this potential issue ;). I will keep this
in mind. For the moment getting to know the Threads extension is
my first priority - my performance question was part of that.

Regards,

Arjen

Andreas Kupries

Feb 8, 2011, 11:22:20 PM

Alexandre Ferrieux <alexandre...@gmail.com> writes:

AFAIK No.

Tcl_Obj's are bound to their thread (per-thread memory allocation) and
cannot be handed over in such a manner. IIRC there is even a Tcl_Panic
deep in tclObj.c which is triggered when it detects that a Tcl_Obj is
freed by a different thread than the one that allocated it.

Thus it goes through the string representation and creates a new
Tcl_Obj in the destination thread, tripling memory usage.

> -Alex

--
So long,
Andreas Kupries <akup...@shaw.ca>
<http://www.purl.org/NET/akupries/>
Developer @ <http://www.activestate.com/>
-------------------------------------------------------------------------------

Andreas Leitgeb

Feb 9, 2011, 8:10:10 AM

blacksqr <stephen...@alum.mit.edu> wrote:
> I recently read that SQLite uses shared memory to allow multiple
> processes to access a single database simultaneously without
> duplication of RAM resources.

I doubt that a Tcl app could send Tcl objects to SQLite without
serializing them to strings first. But IIRC it's the serializing
that the OP wanted to avoid.