There are two major issues:
1) PMC initialization (new_p_ic_p): The shared PMC additionally needs
the allocation of the synchronization structure and the MUTEX_INIT.
2) PMC access (set_p_i): locking/unlocking the mutex
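For concreteness, a minimal sketch of what these two costs amount to,
assuming a pthread-based sync structure (the names pmc_sync_t,
shared_pmc_new, and shared_pmc_set_integer are illustrative, not
Parrot's actual internals):

#include <pthread.h>
#include <stdlib.h>

typedef struct pmc_sync_t {
    pthread_mutex_t lock;           /* guards every vtable access */
} pmc_sync_t;

typedef struct pmc_t {
    long        value;
    pmc_sync_t *sync;               /* NULL for unshared PMCs */
} pmc_t;

/* Issue 1: construction pays for an extra allocation plus a MUTEX_INIT. */
pmc_t *shared_pmc_new(void)
{
    pmc_t *pmc = calloc(1, sizeof *pmc);
    pmc->sync  = malloc(sizeof *pmc->sync);
    pthread_mutex_init(&pmc->sync->lock, NULL);
    return pmc;
}

/* Issue 2: every access pays for a lock/unlock pair. */
void shared_pmc_set_integer(pmc_t *pmc, long v)
{
    pthread_mutex_lock(&pmc->sync->lock);
    pmc->value = v;
    pthread_mutex_unlock(&pmc->sync->lock);
}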
Here are snippets from the profile [1]:
with SharedRef
CODE  OP FULL NAME         CALLS  TOTAL TIME   AVG T. ms
----  -----------------  -------  ----------  ----------
 753  new_p_ic_p          100000    0.157785      0.0016
 905  set_p_i             100000    0.049269      0.0005
with Ref
CODE  OP FULL NAME         CALLS  TOTAL TIME   AVG T. ms
----  -----------------  -------  ----------  ----------
 753  new_p_ic_p          100000    0.051330      0.0005
 905  set_p_i             100000    0.011356      0.0001
(Overall timings aren't really comparable; the SharedRef also does a
LOCK for mark, which slows that down as well.)
Linux 2.2.16, Athlon 800, unoptimized Parrot build.
leo
[1]
    set I0, 100000
    set I1, 0
lp:
    new P0, .PerlInt
    new P1, .Ref, P0    # or .SharedRef
    set P1, I1
    inc I1
    lt I1, I0, lp
    end
> (Overall timings aren't really comparable; the SharedRef also does a
> LOCK for mark, which slows that down as well.)
?? Why'd you do that? Competing threads MUST be suspended (most likely
with their cooperation, not with an OS suspend call) during the mark
phase.
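For what it's worth, here is a rough sketch of such cooperative
suspension (illustrative names, not Parrot code): each mutator checks a
flag at safe points and parks itself, the marker waits until everyone
is parked, marks without taking any per-PMC LOCK, then resumes them.

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t gc_mutex      = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gc_resume     = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  gc_all_parked = PTHREAD_COND_INITIALIZER;
static bool gc_pending      = false;
static int  mutator_threads = 0;    /* threads that must cooperate */
static int  parked_threads  = 0;

/* Each interpreter thread calls this at safe points (e.g. between ops). */
void gc_safepoint(void)
{
    pthread_mutex_lock(&gc_mutex);
    if (gc_pending) {
        parked_threads++;
        pthread_cond_signal(&gc_all_parked);
        while (gc_pending)
            pthread_cond_wait(&gc_resume, &gc_mutex);
        parked_threads--;
    }
    pthread_mutex_unlock(&gc_mutex);
}

/* The thread that triggers DOD: suspend everyone, mark, resume. */
void gc_mark_all(void (*mark_phase)(void))
{
    pthread_mutex_lock(&gc_mutex);
    gc_pending = true;
    while (parked_threads < mutator_threads)
        pthread_cond_wait(&gc_all_parked, &gc_mutex);
    mark_phase();               /* no competing mutators, no per-PMC LOCK */
    gc_pending = false;
    pthread_cond_broadcast(&gc_resume);
    pthread_mutex_unlock(&gc_mutex);
}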
--
Gordon Henriksen
IT Manager
ICLUBcentral Inc.
gor...@iclub.com
>1) PMC initialization (new_p_ic_p): The shared PMC additionally needs
>the allocation of the synchronization structure and the MUTEX_INIT.

This is a one-time cost. If a PMC has a sync structure, it should stick
around after the PMC is destroyed and put on the free list.
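Something along these lines, perhaps (illustrative names, not the real
free-list code): the sync pointer survives the trip through the free
list, so a reused header skips both the malloc and the MUTEX_INIT.

#include <pthread.h>
#include <stdlib.h>

typedef struct pmc_sync_t { pthread_mutex_t lock; } pmc_sync_t;

typedef struct pmc_t {
    long          value;
    pmc_sync_t   *sync;         /* survives trips through the free list */
    struct pmc_t *next_free;
} pmc_t;

static pmc_t *free_list = NULL;

void pmc_free(pmc_t *pmc)
{
    /* Keep pmc->sync (and its initialized mutex) attached. */
    pmc->next_free = free_list;
    free_list = pmc;
}

pmc_t *pmc_alloc_shared(void)
{
    pmc_t *pmc = free_list;
    if (pmc)
        free_list = pmc->next_free;     /* reuse: sync already set up */
    else
        pmc = calloc(1, sizeof *pmc);   /* first time: pay full cost */
    if (!pmc->sync) {
        pmc->sync = malloc(sizeof *pmc->sync);
        pthread_mutex_init(&pmc->sync->lock, NULL);
    }
    pmc->value = 0;
    return pmc;
}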
>2) PMC access (set_p_i): locking/unlocking the mutex
>with SharedRef
>  905  set_p_i             100000    0.049269      0.0005
>
>with Ref
>  905  set_p_i             100000    0.011356      0.0001
Yeah, that's about right. There is, unfortunately, little to be done about it.
--
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
d...@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk
>> (Overall timings aren't really comparable; the SharedRef also does a
>> LOCK for mark, which slows that down as well.)
> ?? Why'd you do that?
I didn't do it :) Pmc2c.pm is too dumb; it just puts a LOCK/UNLOCK pair
around each vtable method, and no one tells it that there are
exceptions ;)
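The generated wrapper then presumably looks roughly like this (a
schematic, not Pmc2c.pm's literal output):

#include <pthread.h>

typedef struct pmc_sync_t { pthread_mutex_t lock; } pmc_sync_t;
typedef struct pmc_t { long value; pmc_sync_t *sync; } pmc_t;

/* Schematic LOCK/UNLOCK macros over the PMC's sync mutex. */
#define LOCK(pmc)   pthread_mutex_lock(&(pmc)->sync->lock)
#define UNLOCK(pmc) pthread_mutex_unlock(&(pmc)->sync->lock)

/* Pmc2c.pm wraps every vtable method the same way, even mark, where
 * the lock is pointless because mutators are already suspended. */
void SharedRef_set_integer_native(pmc_t *self, long v)
{
    LOCK(self);
    self->value = v;
    UNLOCK(self);
}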
leo
> This is a one-time cost. If a PMC has a sync structure, it should stick
> around after the PMC is destroyed and put on the free list.
Ah yep. Good idea.
leo
I'll admit, I'm tempted to make it part of the arena initialization
cost -- when a new PMC arena is allocated because we're out of
headers, we just unconditionally give 'em all sync structs if we're
running threaded.
The only worry there is resource exhaustion. Rumor has it that some
systems have a limited number of mutexes, but I've never actually
seen one of them.
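A sketch of that arena-level variant (hypothetical names, not the
actual arena code), paying all the MUTEX_INITs up front when the arena
is carved into headers:

#include <pthread.h>
#include <stdlib.h>

typedef struct pmc_sync_t { pthread_mutex_t lock; } pmc_sync_t;
typedef struct pmc_t { long value; pmc_sync_t *sync; } pmc_t;

#define ARENA_SIZE 256

typedef struct pmc_arena_t {
    pmc_t      headers[ARENA_SIZE];
    pmc_sync_t syncs[ARENA_SIZE];   /* one mutex per header, up front */
} pmc_arena_t;

/* Called when we run out of headers; 'threaded' selects the extra cost. */
pmc_arena_t *new_pmc_arena(int threaded)
{
    pmc_arena_t *arena = calloc(1, sizeof *arena);
    for (int i = 0; i < ARENA_SIZE; i++) {
        if (threaded) {
            pthread_mutex_init(&arena->syncs[i].lock, NULL);
            arena->headers[i].sync = &arena->syncs[i];
        }
    }
    return arena;
}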
> Here are snippets from the profile [1]:
> with SharedRef
> CODE  OP FULL NAME         CALLS  TOTAL TIME   AVG T. ms
> ----  -----------------  -------  ----------  ----------
>  753  new_p_ic_p          100000    0.157785      0.0016
>  905  set_p_i             100000    0.049269      0.0005
> with Ref
> CODE  OP FULL NAME         CALLS  TOTAL TIME   AVG T. ms
> ----  -----------------  -------  ----------  ----------
>  753  new_p_ic_p          100000    0.051330      0.0005
>  905  set_p_i             100000    0.011356      0.0001
I have redefined the locking macros in SharedRef to use a rwlock:
CODE  OP FULL NAME         CALLS  TOTAL TIME   AVG T. ms
----  -----------------  -------  ----------  ----------
 753  new_p_ic_p          100000    0.168588      0.0017
  -3  DOD                    212    0.106973      0.5046
 905  set_p_i             100000    0.087742      0.0009
So pthread_rwlock_rdlock() is about 100% slower than
pthread_mutex_lock(). Acquiring a rwlock seems reasonable only for
lengthy operations, such as sorting an array.
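For reference, the redefinition amounts to something like this
(schematic macros mirroring the mutex versions; the exact read/write
split used in the test isn't shown here): short read accesses take the
read lock, mutations take the write lock.

#include <pthread.h>

typedef struct pmc_sync_t { pthread_rwlock_t lock; } pmc_sync_t;
typedef struct pmc_t { long value; pmc_sync_t *sync; } pmc_t;

/* Readers can overlap; writers are exclusive. */
#define LOCK_READ(pmc)  pthread_rwlock_rdlock(&(pmc)->sync->lock)
#define LOCK_WRITE(pmc) pthread_rwlock_wrlock(&(pmc)->sync->lock)
#define UNLOCK(pmc)     pthread_rwlock_unlock(&(pmc)->sync->lock)

long SharedRef_get_integer(pmc_t *self)
{
    LOCK_READ(self);
    long v = self->value;
    UNLOCK(self);
    return v;
}

void SharedRef_set_integer_native(pmc_t *self, long v)
{
    LOCK_WRITE(self);
    self->value = v;
    UNLOCK(self);
}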
leo