Re: new block allocator

Chris Angelico

Jun 16, 2012, 3:51:44 AM

to Pike mailinglist

On Tue, Apr 3, 2012 at 7:16 AM, Arne Goedeke <e...@laramies.com> wrote:
> We pushed an experimental branch which uses a new block allocator we have
> been developing. This one uses a different algorithm than the one we briefly
> mentioned on the conference. The new one uses a hashtable for page lookup
> instead of radix trees.
>
> The branch can be found under arne/block_alloc, a GJAlloc-bundle is
> available at http://fortranland.com/GJAlloc-0.08.tar.gz or from
> https://github.com/arneg/GJAlloc via git. The branch currently contains a lot
> of obsolete history, which we would squash into one commit in case of
> inclusion.

Apologies for the thread necromancy, but I'm playing around with this
new allocator and can't get it to build. I cloned GJAlloc from github,
managed to get that to build, and installed it into its chosen
location of /usr/local/lib (which is listed in /etc/ld.so.conf), and
then went to build Pike. I can compile a sample piece of code similar
to the one in src/configure, but 'make reconfigure' closes with
"WARNING: Gjalloc could not be found, thats bad!", and running
bin/pike gives an opening message "NO PIKE_NEW_BLOCK_ALLOC".

This is using Pike on branch arne/block_alloc, and GJAlloc on branch master.

Attempting a 'make clean' and 'make' didn't change anything, neither
did re-cloning (from a local repository clone).

Any advice would be greatly appreciated.

Thanks!

Chris Angelico

Arne Goedeke

Jun 16, 2012, 4:49:01 AM

to Chris Angelico, Pike mailinglist

You can currently build it with pike using bundles. It should be enough
to place a tarball into bundles/ in your pike source dir. Use

make dist-gzip

to make one. I have been working on the allocator again for a few days. The current
state of the block_alloc branch did not work with the current
github branch of gjalloc. I pushed those changes that should make it
work again.

Note that pike uses the gjalloc.c file directly from
src/block_allocator.c currently to emulate the old macro-api. Hence,
currently, I think having it as a bundle is necessary to get it running.

Sorry for the mess! ,)

arne

PS: there are still some bugs right now; I hope you don't run into them. You
can also check out an older version of gjalloc like
cd9dc5e45ddaeec96f7e64f9410dec905ae2913f

Arne Goedeke

Jun 16, 2012, 5:15:14 AM

to Pike mailinglist

When updating the bundle, I usually remove the build/ directory
and rebuild completely. Is it possible to make the build process realize
that a bundle has been changed, i.e. the tarball has been updated? I am
doing this on an Atom, so rebuilds take forever...

Chris Angelico

Jun 16, 2012, 5:26:30 AM

to Pike mailinglist

On Sat, Jun 16, 2012 at 6:49 PM, Arne Goedeke <e...@laramies.com> wrote:
> You can currently build it with pike using bundles. It should be enough
> to place a tarball into bundles/ in your pike source dir. Use
>
> make dist-gzip
>
> to make one. I have been working on the allocator again for a few days. The
> current state of the block_alloc branch did not work with the current
> github branch of gjalloc. I pushed those changes that should make it
> work again.

Okay, trying that. The .tar.gz built with no problems; I copied it into
bundles/ with no visible effect - do I have to register it with the
build in some way?

ChrisA

Arne Goedeke

Jun 16, 2012, 5:32:00 AM

to Chris Angelico, Pike mailinglist

On Sat, 16 Jun 2012, Chris Angelico wrote:

I usually have to build twice until it works. As I said, I remove build/
and then rebuild once from scratch (and then one more 'make all').
I don't have a good understanding of the build system. I remember
trying to tweak it to build properly on the first go. Maybe someone
else can help with this...

Chris Angelico

Jun 16, 2012, 8:06:23 AM

to Pike mailinglist

On Sat, Jun 16, 2012 at 7:32 PM, Arne Goedeke <e...@laramies.com> wrote:
>
> I usually have to build twice until it works. As I said, I remove build/
> and then rebuild once from scratch (and then one more 'make all').
> I don't have a good understanding of the build system. I remember
> trying to tweak it to build properly on the first go. Maybe someone
> else can help with this...

Cloning fresh and starting over.

Clone. Checkout branch. Copy .tar.gz from ../GJAlloc/ to bundles/.
Make once. (Test Pike. It segfaults.) Remove build/ subtree. Make
again. Test Pike (it segfaults again). Wipe out build/ and make
reconfigure and make. Still segfaulting.

Do I need to destroy bin/ as well as build/ ? Trying that now.

ChrisA

Arne Goedeke

Jun 16, 2012, 12:25:05 PM

to Chris Angelico, Pike mailinglist

It's maybe simply too messy right now. If you still have the energy to test
this stuff, please try running older versions (like before the half
finished changes I did this week) of both gjalloc and the block_alloc
branch. Otherwise you can wait until I have sorted these things out.
Some parts are in transition right now. Sorry again, not trying to
waste your time with this..

arne

On Sat, 16 Jun 2012, Chris Angelico wrote:

Chris Angelico

Jun 16, 2012, 7:42:40 PM

to Pike mailinglist

On Sun, Jun 17, 2012 at 2:25 AM, Arne Goedeke <e...@laramies.com> wrote:
> It's maybe simply too messy right now. If you still have the energy to test
> this stuff, please try running older versions (like before the half
> finished changes I did this week) of both gjalloc and the block_alloc
> branch. Otherwise you can wait until I have sorted these things out.
> Some parts are in transition right now. Sorry again, not trying to
> waste your time with this..

I'm happy to tinker! It's what I do :)

Currently attempting to go back as far as April, as I'm not certain
how the two repositories sync up otherwise. GJAlloc commit 11190f3 and
Pike commit 56f88e7 (this one being immediately before the merge of
7.9). Is my step sequence correct? (Wiping out bin/ was a stupid idea
- some of it is in repository. Wiping it and checking it out from repo
didn't help either, though.)

That eliminates the segfaulting, but it's still not using the new
allocator (same message from configure and same message on startup).

I'd like to try this, as it'd be awesomely cool to be able to run
multiple threads on my multi-core system. Currently only one thread at
a time seems to be able to run, when I do something trivial like:

void spin() {int i=0; while (1) ++i;}

Incidentally, the even more trivial:

void spin() {while (1);}

doesn't even use CPU time. Pike is so awesome that it can optimize
infinite loops! :)

Am I right in thinking that the reason the spinners can't parallelize
is because they're all locking on updating i, and that your new
allocator would change that? Or am I conflating two proposed changes?
I seem to recall there being something about separating local objects
from global ones, a while ago.

Chris Angelico

Arne Goedeke

Jun 17, 2012, 3:38:03 AM

to Chris Angelico, Pike mailinglist

It's not related to multithreading. The block allocator in pike is used
to dynamically allocate fixed size objects. This happens internally in
many places. The advantage of having a special allocator for that is
lower memory fragmentation and usually faster allocation than using
system malloc. This so-called 'new block allocator' aims to be faster
than the currently used one, in particular when having many allocations.
It should not influence program execution, i.e. sorry for the segfault ,)

arne
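
To make the description above concrete, here is a minimal sketch of a fixed-size block allocator in C. It is not the GJAlloc code and not Pike's existing block allocator: every name in it (`block_pool`, `pool_alloc`, and so on) is hypothetical, and it leaves out the page lookup (hashtable or radix tree) that a real allocator would need in order to find and release empty pages. It only illustrates why carving equal-sized slots out of larger pages can make allocation and release a couple of pointer operations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is an illustration, not GJAlloc's API. */

/* A free slot stores the link to the next free slot inside itself. */
struct free_slot { struct free_slot *next; };

/* Page header; the union keeps the slots that follow it maximally aligned. */
union page_header {
    union page_header *next;
    max_align_t align;
};

struct block_pool {
    size_t slot_size;            /* rounded-up size of every object */
    size_t slots_per_page;
    struct free_slot *free_list; /* LIFO list of free slots */
    union page_header *pages;    /* all pages, kept for pool_destroy() */
};

void pool_init(struct block_pool *pool, size_t object_size, size_t slots_per_page)
{
    size_t unit = sizeof(max_align_t);
    assert(pool != NULL);
    if (object_size < sizeof(struct free_slot))
        object_size = sizeof(struct free_slot);
    /* Round up so that every slot in a page stays suitably aligned. */
    pool->slot_size = (object_size + unit - 1) / unit * unit;
    pool->slots_per_page = slots_per_page ? slots_per_page : 1;
    pool->free_list = NULL;
    pool->pages = NULL;
}

/* Get one more page from malloc and thread its slots onto the free list. */
static int pool_grow(struct block_pool *pool)
{
    union page_header *page =
        malloc(sizeof *page + pool->slot_size * pool->slots_per_page);
    if (page == NULL)
        return -1;
    page->next = pool->pages;
    pool->pages = page;

    char *base = (char *)(page + 1);
    for (size_t i = 0; i < pool->slots_per_page; i++) {
        struct free_slot *slot = (struct free_slot *)(base + i * pool->slot_size);
        slot->next = pool->free_list;
        pool->free_list = slot;
    }
    return 0;
}

/* Pop a slot from the free list, growing the pool if it is empty. */
void *pool_alloc(struct block_pool *pool)
{
    if (pool->free_list == NULL && pool_grow(pool) != 0)
        return NULL;
    struct free_slot *slot = pool->free_list;
    pool->free_list = slot->next;
    return slot;
}

/* Push a slot back onto the free list; ptr must come from pool_alloc(). */
void pool_free(struct block_pool *pool, void *ptr)
{
    assert(ptr != NULL);
    struct free_slot *slot = ptr;
    slot->next = pool->free_list;
    pool->free_list = slot;
}

/* Release every page; all objects from this pool become invalid. */
void pool_destroy(struct block_pool *pool)
{
    while (pool->pages != NULL) {
        union page_header *next = pool->pages->next;
        free(pool->pages);
        pool->pages = next;
    }
    pool->free_list = NULL;
}
```

Because all slots in a pool have the same size, a freed slot can be reused by the very next allocation, which is where the lower fragmentation compared with a general-purpose malloc comes from in this simplified picture.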

Chris Angelico

Jun 17, 2012, 3:50:16 AM

to Pike mailinglist

On Sun, Jun 17, 2012 at 5:38 PM, Arne Goedeke <e...@laramies.com> wrote:
> It's not related to multithreading. The block allocator in pike is used
> to dynamically allocate fixed size objects. This happens internally in
> many places. The advantage of having a special allocator for that is
> lower memory fragmentation and usually faster allocation than using
> system malloc. This so-called 'new block allocator' aims to be faster
> than the currently used one, in particular when having many allocations.
> It should not influence program execution, i.e. sorry for the segfault ,)

Ah! My bad. Oh well.

So where _was_ that alternative allocator proposal? I seem to remember
it being linked to on this list, but I can't find it anywhere. It had
stuff about different pools of objects and different lockings so that
code that's manipulating only objects that can be accessed only from
that thread (including the common case of stack-only work) wouldn't
need to lock against other threads. It was an incredibly cool
proposal, but now I can't find it again :(

ChrisA

Arne Goedeke

Jun 17, 2012, 4:03:09 AM

to Chris Angelico, Pike mailinglist

On Sun, 17 Jun 2012, Chris Angelico wrote:

Martin Stjernholm laid out a plan of what would be necessary to
accomplish this. It's probably what you are referring to. There is a
branch called mast/multi-cpu with a .txt file with a lot of details.
There is also a define which activates more fine-grained locking, but I
don't know if that still works. It was written before I worked on pike,
and I think it was abandoned because all those locks are too expensive
performance-wise to make it worthwhile.

arne

Chris Angelico

Jun 17, 2012, 4:12:21 AM

to Pike mailinglist

On Sun, Jun 17, 2012 at 6:03 PM, Arne Goedeke <e...@laramies.com> wrote:
> Martin Stjernholm laid out a plan of what would be necessary to
> accomplish this. It's probably what you are referring to. There is a
> branch called mast/multi-cpu with a .txt file with a lot of details.
> There is also a define which activates more fine-grained locking, but I
> don't know if that still works. It was written before I worked on pike,
> and I think it was abandoned because all those locks are too expensive
> performance-wise to make it worthwhile.

Ah yes! That's the one I was looking for. Thanks!

I suspect that this level of language development is probably beyond
my skills as a programmer, but it's something that interests me
(though I don't actually have a major use-case), so when it comes to
testing and tweaking, I'll be happy to help out.

ChrisA

Per Hedbor () @ Pike (-) importmöte för mailinglistan

Jun 17, 2012, 9:45:02 AM

to pi...@roxen.com

> I suspect that this level of language development is probably beyond
> my skills as a programmer, but it's something that interests me
> (though I don't actually have a major use-case), so when it comes to
> testing and tweaking, I'll be happy to help out.

In general, if you want more performance, multiple un-connected
threads, processes or whatever are usually very much preferable to
shared-memory threads (like pike has).

Share-nothing message passing systems are also the only thing that
scales to non-trivial systems (like, say, Opera Mini with a few
thousand nodes, and a few hundred thousand hw-threads).

Locks can never ever give enough performance. You even have to be
very careful about things like having the code in the wrong cache
(bigger systems are usually NUMA nodes).

--
Per
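
A small sketch of the share-nothing style described above, assuming a POSIX system. Each worker is a separate process that computes on private memory and sends its result back as a single message over a pipe; nothing is shared and nothing is locked. The names `sum_range` and `parallel_sum` are hypothetical and are not part of Pike or of any framework mentioned in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Worker body: sum the integers in [lo, hi) using only private memory. */
static uint64_t sum_range(uint64_t lo, uint64_t hi)
{
    uint64_t s = 0;
    for (uint64_t i = lo; i < hi; i++)
        s += i;
    return s;
}

/*
 * Split the sum of [0, n) across nworkers child processes.
 * Returns 0 and stores the result in *total on success, -1 on failure.
 */
int parallel_sum(uint64_t n, int nworkers, uint64_t *total)
{
    int fd[2];
    assert(total != NULL);
    if (nworkers < 1 || pipe(fd) != 0)
        return -1;

    int started = 0;
    for (int w = 0; w < nworkers; w++) {
        uint64_t chunk = n / (uint64_t)nworkers;
        uint64_t lo = chunk * (uint64_t)w;
        uint64_t hi = (w == nworkers - 1) ? n : lo + chunk;

        pid_t pid = fork();
        if (pid < 0)
            break;                      /* could not start this worker */
        if (pid == 0) {
            /* Child: compute, send one fixed-size message, exit. */
            close(fd[0]);
            uint64_t part = sum_range(lo, hi);
            ssize_t sent = write(fd[1], &part, sizeof part);
            _exit(sent == (ssize_t)sizeof part ? 0 : 1);
        }
        started++;
    }
    close(fd[1]);   /* parent only reads; EOF arrives once all children exit */

    /* Each message is smaller than PIPE_BUF, so writes do not interleave. */
    uint64_t sum = 0, part;
    int received = 0;
    while (read(fd[0], &part, sizeof part) == (ssize_t)sizeof part) {
        sum += part;
        received++;
    }
    close(fd[0]);

    while (started-- > 0)
        wait(NULL);                     /* reap the workers */

    if (received != nworkers)
        return -1;
    *total = sum;
    return 0;
}
```

The same shape carries over to separate machines by replacing the pipe with a socket, which is part of the appeal of this style; the cost is that any data the workers need has to be copied to them instead of being read in place.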

Martin Stjernholm

Jun 17, 2012, 3:43:00 PM

to pi...@roxen.com

"Per Hedbor () @ Pike (-) importmöte för mailinglistan" <63...@lyskom.lysator.liu.se> wrote:

> In general, if you want more performance, multiple un-connected
> threads, processes or whatever are usually very much preferable to
> shared-memory threads (like pike has).
>
> Share-nothing message passing systems are also the only thing that
> scales to non-trivial systems (like, say, Opera Mini with a few
> thousand nodes, and a few hundred thousand hw-threads).

The multi-cpu proposal aims to optimize mainly the use case with a large
amount of shared read-often-change-seldom data. Separate processes
without shared memory surely scale better per se, but they do not
address that problem.

Efficient multi-cpu computing, or by extension scalable computing across
cpus/nodes/clusters on any level, needs both this, message passing, and
other tools with different granularity. It's not a problem that is
solved once and for all with a single technique - it needs to be solved
on every level.

Anyway, in most cases one can stop short of the message passing level,
which changes the design drastically, so it's very worthwhile to build
good tools with finer granularity.

> Locks can never ever give enough performance. /.../

True, shared data structures need to be lock free. There's really no
reason to not make hash tables lock free, and there's actually some hope
of making binary trees lock free as well.
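
As an illustration of the kind of lock-free shared structure mentioned above, here is a minimal sketch of an insert-only hash table using C11 atomics. It is not taken from Pike or from the mast/multi-cpu branch; the names (`lf_table`, `lf_insert`, `lf_find`) and the fixed bucket count are assumptions made for the example. Deletion and resizing are deliberately left out, since they require a safe memory reclamation scheme that is well beyond a short sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define LF_NBUCKETS 64   /* fixed size; a real table would need to grow */

struct lf_node {
    unsigned key;
    int value;
    struct lf_node *next;   /* written before publication, never changed after */
};

struct lf_table {
    _Atomic(struct lf_node *) buckets[LF_NBUCKETS];
};

void lf_init(struct lf_table *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < LF_NBUCKETS; i++)
        atomic_init(&t->buckets[i], NULL);
}

/*
 * Publish a new node at the head of its bucket with compare-and-swap.
 * No thread ever blocks: if another insert wins the race, the CAS fails,
 * n->next is refreshed with the new head, and we simply try again.
 * Nodes are never freed in this sketch.
 */
int lf_insert(struct lf_table *t, unsigned key, int value)
{
    struct lf_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->key = key;
    n->value = value;

    _Atomic(struct lf_node *) *head = &t->buckets[key % LF_NBUCKETS];
    n->next = atomic_load_explicit(head, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(
               head, &n->next, n,
               memory_order_release, memory_order_relaxed)) {
        /* n->next now holds the current head; retry. */
    }
    return 0;
}

/* Readers take no lock; the most recently inserted match is returned. */
const struct lf_node *lf_find(struct lf_table *t, unsigned key)
{
    const struct lf_node *n =
        atomic_load_explicit(&t->buckets[key % LF_NBUCKETS], memory_order_acquire);
    for (; n != NULL; n = n->next)
        if (n->key == key)
            return n;
    return NULL;
}
```

The release ordering on the successful compare-and-swap pairs with the acquire load in `lf_find`, so a reader that sees a node also sees its fully initialized fields. This suits the read-often-change-seldom case; it does nothing to avoid cache-line traffic when many threads insert into the same bucket.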

Martin Bähr

Jun 17, 2012, 9:48:03 PM

to Per Hedbor () @ Pike (-) importmöte fö, pi...@roxen.com

On Sun, Jun 17, 2012 at 01:45:02PM +0000, Per Hedbor () @ Pike (-) importmöte för mailinglistan wrote:
> In general, if you want more performance, multiple un-connected
> threads, processes or whatever are usually very much preferable to
> shared-memory threads (like pike has).
>
> Share-nothing message passing systems are also the only thing that
> scales to non-trivial systems

is this what erlang is designed to do?

> (like, say, Opera Mini with a few
> thousand nodes, and a few hundred thousand hw-threads).

do you have any framework or modules that helps to build such systems?

greetings, martin?
--
cooperative communication with sTeam - caudium, pike, roxen and unix
services: debugging, programming, training, linux sysadmin, web development
--
pike programmer working in china societyserver.(org|net)
foresight developer community.gotpike.org foresightlinux.org
unix sysadmin (open-steam|www.caudium).org realss.com
Martin Bähr http://societyserver.org/mbaehr/

Stephen R. van den Berg

Jun 18, 2012, 9:48:29 AM

to Chris Angelico, Pike mailinglist

Chris Angelico wrote:
>Clone. Checkout branch. Copy .tar.gz from ../GJAlloc/ to bundles/.
>Make once. (Test Pike. It segfaults.) Remove build/ subtree. Make
>again. Test Pike (it segfaults again). Wipe out build/ and make
>reconfigure and make. Still segfaulting.

I had Pike 7.9 segfaulting quite consistently after the new machine code
optimisations came in (it faulted when started interactively, it ran without
problems when running a program-script).
--
Stephen.

"Please, somebody give him a blowjob so we can impeach him."

Henrik Grubbström

Jun 18, 2012, 10:00:18 AM

to Stephen R. van den Berg, Chris Angelico, Pike mailinglist

On Mon, 18 Jun 2012, Stephen R. van den Berg wrote:

> Chris Angelico wrote:
>> Clone. Checkout branch. Copy .tar.gz from ../GJAlloc/ to bundles/.
>> Make once. (Test Pike. It segfaults.) Remove build/ subtree. Make
>> again. Test Pike (it segfaults again). Wipe out build/ and make
>> reconfigure and make. Still segfaulting.
>
> I had Pike 7.9 segfaulting quite consistently after the new machine code
> optimisations came in (it faulted when started interactively, it ran without
> problems when running a program-script).

Yes, I wouldn't call Pike 7.9 stable at the moment...
I and Per have introduced (and fixed) several bugs in the code-generator
for amd64. I believe that all of the known bugs have been fixed now, but...

--
Henrik Grubbström gru...@roxen.com
Roxen Internet Software AB

Chris Angelico

Jun 18, 2012, 10:51:55 AM

to Pike mailinglist

On which platform? I've had no trouble under Linux. The only segfaults
have been while playing with the other allocator, and even there, it's
possible to build it so it doesn't.

I can crash 7.8.352 for Windows, but not 7.9.anything for Linux.

ChrisA

Henrik Grubbström

Jun 18, 2012, 11:00:51 AM

to Chris Angelico, Pike mailinglist

On Tue, 19 Jun 2012, Chris Angelico wrote:

> On Tue, Jun 19, 2012 at 12:00 AM, Henrik Grubbström <gru...@roxen.com> wrote:
>> On Mon, 18 Jun 2012, Stephen R. van den Berg wrote:
>>
>>> Chris Angelico wrote:
>>>>
>>>> Clone. Checkout branch. Copy .tar.gz from ../GJAlloc/ to bundles/.
>>>> Make once. (Test Pike. It segfaults.) Remove build/ subtree. Make
>>>> again. Test Pike (it segfaults again). Wipe out build/ and make
>>>> reconfigure and make. Still segfaulting.
>>>
>>>
>>> I had Pike 7.9 segfaulting quite consistently after the new machine code
>>> optimisations came in (it faulted when started interactively, it ran
>>> without
>>> problems when running a program-script).
>>
>>
>> Yes, I wouldn't call Pike 7.9 stable at the moment...
>> I and Per have introduced (and fixed) several bugs in the code-generator for
>> amd64. I believe that all of the known bugs have been fixed now, but...
>
> On which platform? I've had no trouble under Linux. The only segfaults
> have been while playing with the other allocator, and even there, it's
> possible to build it so it doesn't.

Amd64 (aka x86_64), and this morning (CET) there were a few bugs in the
peep-hole optimizer that could break certain code.