Re: [perl-compiler] Re: av_extend speed

J. Nick Koston

Jun 29, 2010, 1:29:05 PM
to perl-c...@googlegroups.com
I tried to allocate them all in one chunk, but when av_extend happens it's going to try to free the big chunk (perl_destruct would fail as well). -fav-init was definitely slower.
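To illustrate the failure mode, here is a minimal C sketch (not B::C output). It assumes a hypothetical fake_av struct standing in for an AV body; carve_all and individually_freeable are made-up helper names. Only the first carved array starts at an address that malloc handed out, so any later free() or realloc() of the other arrays, which is roughly what growing or destroying an array amounts to, would be handed an interior pointer.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an AV body: only the fields needed here. */
typedef struct {
    void **alloc;   /* plays the role of AvALLOC(av) */
    size_t max;     /* number of slots reserved for this array */
} fake_av;

/* Carve n arrays of `slots` pointers each out of one malloc'd block.
 * Returns the block base, the only address free() may ever receive. */
static void **carve_all(fake_av *avs, size_t n, size_t slots)
{
    void **base = malloc(n * slots * sizeof *base);
    size_t i;

    if (!base)
        return NULL;
    for (i = 0; i < n; i++) {
        avs[i].alloc = base + i * slots;
        avs[i].max = slots;
    }
    return base;
}

/* An array can be freed or realloc'ed on its own only if its storage
 * begins exactly at the block base; every other one is an interior pointer. */
static int individually_freeable(const fake_av *av, void **base)
{
    return av->alloc == base;
}
```

With three arrays of four slots each, only the first passes the check; the other two point 4 and 8 slots into the block and must never reach free().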

AV_ALL_AT_ONCE.diff

J. Nick Koston

Jun 29, 2010, 2:24:34 PM
to perl-c...@googlegroups.com
if ($PERL510) {
  $init->add(sprintf("\tNewx(svp, %d, SV*);", $fill < 3 ? 3 : $fill+1),
             "\tAvALLOC(av) = svp;",
             "\tAvARRAY(av) = svp;");
} else { # read-only AvARRAY macro
  $init->add(sprintf("\tNewz(0, svp, %d, SV*);", $fill < 3 ? 3 : $fill+1),
             "\tAvALLOC(av) = svp;",
             # XXX Dirty hack from av.c:Perl_av_extend()
             "\tSvPVX(av) = (char*)svp;");
}

I wonder whether Newz is needed for perl 5.8/5.6, since av_extend just uses New. Switching to New would remove the expensive memset overhead.
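A rough C sketch of why the zeroing can be redundant: if every slot is written right after allocation, a preceding memset does no useful work. The helper names slots_for_fill and alloc_filled are hypothetical, and plain malloc stands in for New/Newx. The sketch also pads the unused slots explicitly, which is an assumption on my part; the generated code above writes only the first $fill+1 entries, and whether perl needs the remaining slots initialized is not settled here.

```c
#include <stdlib.h>

/* Hypothetical helper mirroring the "$fill < 3 ? 3 : $fill+1" sizing. */
static size_t slots_for_fill(long fill)
{
    return fill < 3 ? 3 : (size_t)fill + 1;
}

/* Allocate without zeroing, then write every slot exactly once:
 * the first fill+1 slots come from `entries`, the rest are set to `pad`.
 * Because no slot is left unwritten, zeroing first would be wasted work. */
static void **alloc_filled(void *const *entries, long fill, void *pad)
{
    size_t n = slots_for_fill(fill);
    void **svp = malloc(n * sizeof *svp);
    size_t i;

    if (!svp)
        return NULL;
    for (i = 0; i < n; i++)
        svp[i] = (fill >= 0 && i <= (size_t)fill) ? entries[i] : pad;
    return svp;
}
```

For fill = 1 this reserves three slots, copies two entries, and sets the third to the pad value (in perl that role would presumably be played by &PL_sv_undef or NULL).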

On Jun 29, 2010, at 12:21 PM, Reini Urban wrote:

> 2010/6/29 J. Nick Koston <ni...@cpanel.net>:
>> After doing some testing it is really overhead of safesysmalloc
>> that is taking all the time and not av_extend itself. I am pretty
>> sure it is just making the AVs for the PADs.
>
> Thanks, safesysmalloc is slow, yes.
> -Dusemymalloc is even slower.
>
> In my tests my new -fav-init switch was slower than without.
>
> I can think of a scheme where I collect all static arrays,
> alloc a big chunk, and then set all the av pointers right.
> That's what the compiler can do, esp. for pads, even if the pads
> have to be extended at run-time.
>
> Only free is a problem, but I was thinking of a much faster DESTROY
> phase anyway.
> Only if the compilation is done for a shared lib, or if embedded, do
> we have to run through the complete sv and op destruction traversal.
> For a simple elf or PE executable the OS does it much faster, esp. if
> it's in the main executable.
>
>> {
>>     SV **svp;
>>     AV *av = (AV*)&sv_list[7];
>>     av_extend(av, 0);
>>     svp = AvARRAY(av);
>>     *svp++ = (SV*)&PL_sv_undef;
>>     AvFILLp(av) = 0;
>> }
>> av_extend((AV*)&sv_list[9], 3);
>> {
>>     SV **svp;
>>     AV *av = (AV*)&sv_list[8];
>>     av_extend(av, 0);
>>     svp = AvARRAY(av);
>>     *svp++ = (SV*)(AV*)&sv_list[9];
>>     AvFILLp(av) = 0;
>> }
>> {
>>     SV **svp;
>>     AV *av = (AV*)&sv_list[6];
>>     av_extend(av, 1);
>>     svp = AvARRAY(av);
>>     *svp++ = (SV*)(AV*)&sv_list[7];
>>     *svp++ = (SV*)(AV*)&sv_list[8];
>>     AvFILLp(av) = 1;
>> }
>> CvPADLIST(&sv_list[5]) = (AV*)&sv_list[6];
>>
>> I have no idea if this is useful information or not.
>>
>>
>
>
>
> --
> Reini Urban
> http://phpwiki.org/ http://murbreak.at/
>
> --
> Unsubscribe via mail to perl-compile...@googlegroups.com
> Options: http://groups.google.com/group/perl-compiler?hl=en
>

--
J. Nick Koston

cPanel Inc
ni...@cpanel.net
Office: 7135290800
Mobile: 7133829333

3131 W Alabama St
STE 100 BOX 30
Houston, TX 77098

Reini Urban

Jun 29, 2010, 4:06:46 PM
to perl-c...@googlegroups.com
2010/6/29 J. Nick Koston <ni...@cpanel.net>:
>        if ($PERL510) {
>          $init->add(sprintf("\tNewx(svp, %d, SV*);", $fill < 3 ? 3 : $fill+1),
>                     "\tAvALLOC(av) = svp;",
>                     "\tAvARRAY(av) = svp;");
>        } else { # read-only AvARRAY macro
>          $init->add(sprintf("\tNewz(0, svp, %d, SV*);", $fill < 3 ? 3 : $fill+1),
>                     "\tAvALLOC(av) = svp;",
>                     # XXX Dirty hack from av.c:Perl_av_extend()
>                     "\tSvPVX(av) = (char*)svp;");
>        }
>
> I wonder if Newz is needed for perl 5.8/5.6 as av_extend just uses New. This will remove the expensive memset overhead.

I came to the same idea (without reading your email) and committed
the obvious improvement, for >= 5.10 only for now,
and only if -Dusemymalloc is not set.

5.8 also?

- $init->add(sprintf("\tNewx(svp, %d, SV*);", $fill < 3 ? 3 : $fill+1),
+ $init->add(sprintf(($MYMALLOC
+ ? "\tNewx(svp, %d, SV*);"
+ : "\tsvp = (SV**)malloc(%d * sizeof(SV*));"),

J. Nick Koston

Jun 29, 2010, 4:27:43 PM
to perl-c...@googlegroups.com
perl 5.6.2 does not have Newx
#define New(x,v,n,t) (v = (t*)safemalloc((MEM_SIZE)((n)*sizeof(t))))

5.8.x does
#define Newx(v,n,t) (v = (MEM_WRAP_CHECK_(n,t) (t*)safemalloc((MEM_SIZE)((n)*sizeof(t)))))

New (instead of Newz) seems to work just fine for perl 5.6.2 (and it matches the av_extend behavior).
Newx is used in av_extend on 5.8 so that seems like the way to go.

Reini Urban

Jun 29, 2010, 4:58:38 PM
to perl-c...@googlegroups.com
2010/6/29 J. Nick Koston <ni...@cpanel.net>:
> perl 5.6.2 does not have Newx
> #define New(x,v,n,t)    (v = (t*)safemalloc((MEM_SIZE)((n)*sizeof(t))))
>
> 5.8.x does
> #define Newx(v,n,t)     (v = (MEM_WRAP_CHECK_(n,t) (t*)safemalloc((MEM_SIZE)((n)*sizeof(t)))))
>
> New (instead of Newz) seems to work just fine for perl 5.6.2 (and it matches the av_extend behavior).
> Newx is used in av_extend on 5.8 so that seems like the way to go.

Oops.
I had this Newx problem before, that's why I added
/* Since 5.8.8 */
#ifndef Newx
#define Newx(v,n,t) New(0,v,n,t)
#endif
to the head of each generated .c

Fixed in svn. Fast malloc is now used for av's and
ltrace -c looks much better now here.

Hashes should follow later, but we have no -fhv-init switch yet.
Hashes really should be initialized faster too.

:readonly user-attributes (and compiler sv flags) should be honored.

J. Nick Koston

Jun 29, 2010, 7:40:51 PM
to perl-c...@googlegroups.com
This resulted in a speedup of about 1.5-1.6%. I think we could make some really significant gains by using independent_comalloc to allocate all the AV memory at once.

J. Nick Koston

Jul 3, 2010, 2:42:14 AM
to perl-c...@googlegroups.com
I managed to get an 18% startup time decrease by linking with ptmalloc3 and using independent_comalloc to allocate the memory for all the AVs at once.

static int perl_init()
{
size_t sizes[930] = {48,48,48,64,48,1136,1136,64,48,48,48,64,48,1056,1056,48,1056,1056,48,64,64,64,48,112,112,64,48,368,368,64,48,816,816,64,48,48,48,64,48,64,64,64,48,48,48,64,48,304,304,64,48,128,128,64,48,176,176,64,48,64,64,64,48,2464,2464,368,48,80,80,64,48,96,96,64,2016,64,48,48,48,48,48,640,640,64,48,48,48,64,304,48,320,320,64,48,96,96,64,48,752,752,64,48,48,48,64,48,48,80,80,64,48,240,240,64,48,128,128,64,48,64,64,64,48,48,48,64,48,96,96,64,48,80,80,64,48,48,80,80,64,48,48,160,160,64,48,64,64,64,48,160,160,64,48,304,304,64,48,128,128,64,48,96,96,64,48,80,80,64,48,112,112,64,48,144,144,64,48,208,208,64,48,160,160,64,272,48,80,80,64,48,48,48,64,48,96,96,64,48,128,128,64,48,48,48,64,48,112,112,64,48,128,128,64,48,112,112,64,48,64,64,64,48,368,368,64,48,96,96,64,48,256,256,64,48,48,48,64,48,128,128,64,48,48,48,64,48,48,48,64,48,48,320,320,64,48,48,48,48,48,48,160,48,144,144,64,48,160,160,64,48,176,176,64,48,128,128,64,48,48,48,64,48,192,192,64,48,112,112,64,48,96,96,64,48,960,960,64,48,720,720,64,48,48,1392,1392,64,48,160,160,64,48,112,112,64,48,176,176,64,48,176,176,64,48,608,608,64,48,48,48,64,48,176,176,64,48,656,656,64,48,176,176,64,48,256,256,64,48,80,80,64,48,48,48,64,48,320,320,64,48,48,48,64,48,48,48,64,48,448,448,64,48,48,48,64,48,176,176,64,48,128,128,64,48,304,304,64,112,192,48,240,240,64,48,272,272,64,112,48,240,240,64,48,240,240,64,48,400,400,64,48,48,48,48,64,48,96,96,64,48,48,48,64,48,48,48,64,48,80,80,64,48,96,96,64,48,432,432,64,48,80,80,64,48,80,80,64,48,80,80,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,80,80,64,48,48,48,64,48,80,80,64,48,208,208,64,48,112,112,64,48,48,48,64,48,48,288,288,64,48,384,384,64,48,176,176,64,48,48,48,64,48,64,64,64,48,352,352,64,48,64,64,64,48,64,64,64,48,48,48,64,48,48,48,64,48,48,48,64,48,352,352,64,48,64,64,64,48,48,48,48,64,64,48,48,48,48,64,48,80,80,80,48,1376,1376,64,64,48,64,64,64,48,48,48,64,48,2272,2272,
64,64,48,144,144,64,48,64,64,64,48,48,48,48,64,48,64,64,64,48,64,64,64,48,64,64,64,48,64,64,64,48,80,80,64,1088,48,64,64,64,48,64,64,64,944,48,64,64,64,640,64,48,192,48,64,64,64,48,336,336,64,48,48,48,48,176,176,64,48,64,64,64,48,48,48,64,48,48,48,64,48,416,416,64,48,48,48,64,48,48,48,64,48,48,48,64,48,64,64,64,48,816,816,64,48,64,64,64,48,208,208,64,48,64,64,64,48,64,64,64,48,48,48,64,48,48,48,64,48,64,64,64,48,64,64,64,48,128,128,64,48,64,64,64,48,48,48,64,48,704,704,64,48,368,368,64,48,48,48,48,64,48,48,48,64,48,64,64,64,48,224,48,64,64,64,48,240,240,64,48,48,48,64,48,48,48,48,48,48,64,48,80,80,64,48,64,64,64,48,208,208,64,48,48,48,64,48,128,128,64,48,96,96,64,48,48,48,64,48,128,128,64,48,80,80,64,48,112,112,64,416,48,112,112,64,48,176,176,64,48,48,48,64,48,80,80,64,48,144,144,64,48,128,128,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,48,48,64,48,544,48,128,128,64,48,848,848,64,64,48,48,48,48,48,64,64,64,48,64,64,64,864,928};
void* chunks[930];
if (independent_comalloc( 930, sizes, chunks ) == 0) { die(aTHX_ "panic: AV alloc failed"); };

...

Then the Av saves look like

{
    SV **svp = chunks[1];
    AV *av = (AV*)&sv_list[5];
    AvALLOC(av) = svp;
    SvPVX(av) = (char*)svp;
    *svp++ = (SV*)&PL_sv_undef;
    *svp++ = (SV*)&sv_list[6];
    *svp++ = (SV*)&sv_list[7];
}
{
    SV **svp = chunks[3];
    AV *av = (AV*)&sv_list[9];
    AvALLOC(av) = svp;
    SvPVX(av) = (char*)svp;
}
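For readers without ptmalloc3 at hand, the calling convention can be tried with a stand-in. The sketch below assumes the dlmalloc-style signature independent_comalloc(n_elements, sizes, chunks), which returns NULL on failure; HAVE_INDEPENDENT_COMALLOC, AV_COMALLOC and fallback_comalloc are hypothetical names, not part of B::C or ptmalloc3. The fallback does one malloc per element, so it gives none of the speedup and only keeps the generated code shape identical. As far as I understand the dlmalloc documentation, chunks returned by the real function may each be freed or realloc'ed independently, which would be what avoids the big-chunk free problem from earlier in the thread.

```c
#include <stdlib.h>

/* Hypothetical fallback with the same shape as independent_comalloc:
 * fills chunks[0..n-1] and returns chunks, or NULL on failure. */
static void **fallback_comalloc(size_t n, size_t sizes[], void *chunks[])
{
    size_t i;

    for (i = 0; i < n; i++) {
        chunks[i] = malloc(sizes[i]);
        if (!chunks[i]) {
            while (i > 0)          /* undo the partial allocation */
                free(chunks[--i]);
            return NULL;
        }
    }
    return chunks;
}

#ifdef HAVE_INDEPENDENT_COMALLOC   /* assumed configure-style macro */
extern void **independent_comalloc(size_t, size_t *, void **);
#define AV_COMALLOC independent_comalloc
#else
#define AV_COMALLOC fallback_comalloc
#endif
```

A caller would then write `if (AV_COMALLOC(930, sizes, chunks) == NULL)` and bail out, regardless of which allocator was linked in.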

Reini Urban

Jul 6, 2010, 6:21:35 PM
to perl-c...@googlegroups.com
2010/7/3 J. Nick Koston <ni...@cpanel.net>:

> I managed to get an 18% startup time decrease by linking with ptmalloc3 and using independent_comalloc to allocate the memory for all the AVs at once.

18% is worth some effort :)

> static int perl_init()
> {
> size_t sizes[930] = {...64,64,48,64,64,64,864,928};


> void* chunks[930];
> if (independent_comalloc( 930, sizes, chunks ) == 0) { die(aTHX_ "panic: AV alloc failed"); };
>
> ...
>
> Then the Av saves look like
>
>    {
>        SV **svp = chunks[1];
>        AV *av = (AV*)&sv_list[5];
>        AvALLOC(av) = svp;
>        SvPVX(av) = (char*)svp;
>        *svp++ = (SV*)&PL_sv_undef;
>        *svp++ = (SV*)&sv_list[6];
>        *svp++ = (SV*)&sv_list[7];
>    }
>    {
>        SV **svp = chunks[3];
>        AV *av = (AV*)&sv_list[9];
>        AvALLOC(av) = svp;
>        SvPVX(av) = (char*)svp;
>    }

Attached is my idea for checking for ptmalloc3 and independent_comalloc,
and for using it from C and Perl.

Can you check this and add your C.pm changes to write the chunks?
--
Reini Urban
http://phpwiki.org/ http://murbreak.at/

ptmalloc3.diff

J. Nick Koston

Jul 6, 2010, 7:10:39 PM
to perl-c...@googlegroups.com
Will do when I have some free time this weekend. I will also clean up the other few optimizations and send them to the list soon.

PS: I did get a hard-coded implementation working, but I abandoned it because the increase in memory usage was not worth it for my application. That will not be the case for others, however.

ptmalloc3.diff

Reini Urban

Jul 7, 2010, 5:08:06 PM
to perl-c...@googlegroups.com
On 7 Jul., 01:10, "J. Nick Koston" <n...@cpanel.net> wrote:
> Will do when I have some free time this weekend.    I will also clean up the other few optimizations and send them to the list soon.
>
> PS I did get a hard coded implementation working, however I abandoned it due to the increase in memory usage not being worth it for my application.  That will not be the case for others however.
>
> > 2010/7/3 J. Nick Koston <n...@cpanel.net>:

> >> I managed to get an 18% startup time decrease by linking with ptmalloc3 and using independent_comalloc to allocate the memory for all the AVs at once.
>
> > 18% is worth some effort :)

It was quite some effort, but it works fine now.
The code should also be in the branch independent_comalloc,
once googlecode finally allows my commit (500 Internal Server Error).

No regressions and much faster.

And!
We now have a way to add or override user-defined cflags
and libs for all compiler invocations by editing B::C::Flags.

The only question is the command-line API:
-O2 for -fav-init2
-O1 for the old -fav-init
With -fav-init2 and no independent_comalloc, -fav-init is used.
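The proposed mapping can be written down as a small decision function. This is only a C sketch of the rules listed above, with hypothetical names (av_strategy, resolve_av_strategy); the real option handling lives in the Perl side of the compiler and may differ.

```c
/* Hypothetical names; a sketch of the proposed -O level mapping. */
typedef enum {
    AV_PLAIN,   /* no array-init optimization */
    AV_INIT,    /* the old -fav-init */
    AV_INIT2    /* -fav-init2, needs independent_comalloc */
} av_strategy;

/* opt_level: the numeric -O level; have_comalloc: nonzero if
 * independent_comalloc was found when the compiler was configured. */
static av_strategy resolve_av_strategy(int opt_level, int have_comalloc)
{
    if (opt_level >= 2)
        return have_comalloc ? AV_INIT2 : AV_INIT;  /* fall back to -fav-init */
    if (opt_level == 1)
        return AV_INIT;
    return AV_PLAIN;
}
```

Under these rules, -O2 on a system without independent_comalloc behaves exactly like -O1 as far as array initialization goes.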

--
Reini

independent_comalloc.patch