sage 2.10.2: 18575 Alarm clock

Ondrej Certik

Feb 25, 2008, 11:24:06 PM
to sage-s...@googlegroups.com
Hi,

I have 64bit Debian and I did:

$ wget http://sagemath.org/SAGEbin/linux/64bit/sage-2.10.2-debian-64bit-intel-x86_64-Linux.tar.gz
$ tar xzf sage-2.10.2-debian-64bit-intel-x86_64-Linux.tar.gz
$ ln -s sage-2.10.2-debian-64bit-intel-x86_64-Linux/ sage
$ cd sage
$ ./sage
----------------------------------------------------------------------
| SAGE Version 2.10.2, Release Date: 2008-02-22 |
| Type notebook() for the GUI, and license() for information. |
----------------------------------------------------------------------
The SAGE install tree may have moved.
Regenerating Python.pyo and .pyc files that hardcode the install PATH
(please wait less than a minute)...
Please do not interrupt this.
/home/ondra/ext/sage/local/bin/sage-sage: line 149: 18575 Alarm clock
"$SAGE_ROOT/local/bin/"sage-location
Setting permissions of DOT_SAGE directory so only you can read and write it.

sage:
Exiting SAGE (CPU time 0m0.00s, Wall time 2m2.75s).
Exiting spawned Gap process.

Notice the "Alarm clock" line. What does that mean?

Ondrej

mabshoff

Feb 25, 2008, 11:39:01 PM
to sage-support


On Feb 26, 5:24 am, "Ondrej Certik" <ond...@certik.cz> wrote:
> Hi,
>
> I have 64bit Debian and I did:
>
> $ wget http://sagemath.org/SAGEbin/linux/64bit/sage-2.10.2-debian-64bit-inte...
> $ tar xzf sage-2.10.2-debian-64bit-intel-x86_64-Linux.tar.gz
> $ ln -s sage-2.10.2-debian-64bit-intel-x86_64-Linux/ sage
> $ cd sage
> $ ./sage
> ----------------------------------------------------------------------
> | SAGE Version 2.10.2, Release Date: 2008-02-22 |
> | Type notebook() for the GUI, and license() for information. |
> ----------------------------------------------------------------------
> The SAGE install tree may have moved.
> Regenerating Python.pyo and .pyc files that hardcode the install PATH
> (please wait less than a minute)...
> Please do not interrupt this.
> /home/ondra/ext/sage/local/bin/sage-sage: line 149: 18575 Alarm clock
> "$SAGE_ROOT/local/bin/"sage-location

I just grepped through my tree and there is no "Alarm clock" string
anywhere. I also checked the binary you tried and there is no string
in there either. The log in SAGE_LOCAL/bin doesn't indicate any
commits that I don't have and the repo is clean, i.e. no outstanding
changes.

So: can anybody else reproduce this?

> Setting permissions of DOT_SAGE directory so only you can read and write it.
>
> sage:
> Exiting SAGE (CPU time 0m0.00s, Wall time 2m2.75s).
> Exiting spawned Gap process.
>
> Notice the "Alarm clock" line. What does that mean?
>
> Ondrej

Cheers,

Michael

William Stein

Feb 25, 2008, 11:43:48 PM
to sage-s...@googlegroups.com

This is in sage-location:

import signal
signal.alarm(360)

This is *stupid*. It's doing a process that really really better not
get interrupted, and it's interrupting itself if it takes more than 6
minutes. I can't imagine what idiot wrote such bad code....

... oh wait, that was me. I think that was in there right when that code
was first written, mainly for debugging purposes when I was doing some
testing. Anyway... didn't David Harvey say something in IRC about all my
old code being "impatient" -- I guess this is a prime example of that!
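
For readers wondering what a gentler watchdog could look like, here is a minimal sketch, assuming the goal is only to notice that the work is slow rather than to kill it. It is not the change made in trac #2311; the names `warn_on_alarm` and `run_with_watchdog` are hypothetical. It installs a SIGALRM handler that prints a warning, so the default "terminate the process" action never fires.

```python
import signal
import time


def warn_on_alarm(signum, frame):
    # Hypothetical handler: report slowness but let the work continue.
    print("still working; this is taking longer than expected")


def run_with_watchdog(task, seconds):
    """Run task() with an alarm that warns instead of terminating."""
    previous = signal.signal(signal.SIGALRM, warn_on_alarm)
    signal.alarm(seconds)  # deliver SIGALRM after `seconds` seconds
    try:
        return task()
    finally:
        signal.alarm(0)  # cancel any alarm that is still pending
        signal.signal(signal.SIGALRM, previous)  # restore old disposition


if __name__ == "__main__":
    def slow_task():
        time.sleep(1.5)  # stands in for a long-running job
        return "finished"

    print(run_with_watchdog(slow_task, 1))
```

This has to run in the main thread on a Unix-like system, since Python only allows signal handlers to be set there. Simply dropping the alarm altogether is of course the other obvious option.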

Anyway, fix up at trac #2311:
http://trac.sagemath.org/sage_trac/ticket/2311

--
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org

mabshoff

Feb 25, 2008, 11:48:33 PM
to sage-support
Come again: On Ondrej's box it takes more than *6 minutes* to run that
code? Something is seriously FUBAR on his box then. He did mention it a
while back in IRC but I cannot believe it is still a problem and he
didn't fix it :)

> ... oh wait, that was me. I think that was in there right when that
> code was first written,
> mainly for debugging purposes when I was doing some testing. Anyway... didn't
> David Harvey say something in irc about all my old code being
> "impatient" -- I guess
> this is a prime example of that!
>
> Anyway, fix up at trac #2311:
> http://trac.sagemath.org/sage_trac/ticket/2311
>

Cheers,

Michael

William Stein

Feb 25, 2008, 11:49:43 PM
to sage-s...@googlegroups.com

Evidently. sage-location opens and closes a *lot* of files, so on a slow
computer it could take a long time. Maybe he's using NFS over 10Mbps
ethernet? :-)
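
A rough way to test the "lots of small files" theory on a given filesystem is to time that pattern directly. The sketch below is only a proxy, not what sage-location actually does: the function name `time_small_files` and its parameters are made up for illustration, and the optional `sync` flag forces each file to disk, which is the kind of operation where the filesystem's flushing behaviour tends to matter.

```python
import os
import tempfile
import time


def time_small_files(directory, count=2000, size=1024, sync=False):
    """Create, write and close `count` small files under `directory`.

    Returns the elapsed wall-clock seconds for the write loop only.
    """
    payload = b"x" * size
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        start = time.perf_counter()
        for i in range(count):
            path = os.path.join(scratch, f"file{i:05d}.tmp")
            with open(path, "wb") as fh:
                fh.write(payload)
                if sync:
                    fh.flush()
                    os.fsync(fh.fileno())  # push this file out to the disk
        elapsed = time.perf_counter() - start
    return elapsed  # the scratch files are removed when the block exits


if __name__ == "__main__":
    # Point this at directories on the filesystems you want to compare.
    target = tempfile.gettempdir()
    print(f"{target}: {time_small_files(target):.2f}s")
```

Running it once against a directory on each filesystem gives a first impression; the numbers depend heavily on caching, so repeat runs are worth doing before drawing conclusions.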

-- William

mabshoff

Feb 25, 2008, 11:52:38 PM
to sage-support
<SNIP>

Hi,

> > Come again: On Ondrej's box it takes more than *6 minutes* to run that
> > code? Something is seriously FUBAR on his box then. He did mention it a
> > while back in IRC but I cannot believe it is still a problem and he
> > didn't fix it :)
>
> Evidently. sage-location opens and closes a *lot* of files, so on a slow
> computer it could take a long time. Maybe he's using NFS over 10Mbps
> ethernet? :-)

He is using XFS locally IIRC, which boggles the mind.

> -- William

Cheers,

Michael

<SNIP>

Justin C. Walker

Feb 26, 2008, 12:50:30 AM
to sage-s...@googlegroups.com

On Feb 25, 2008, at 20:39 , mabshoff wrote:
> On Feb 26, 5:24 am, "Ondrej Certik" <ond...@certik.cz> wrote:
>> Hi,
>>
>> I have 64bit Debian and I did:
[snip]

>> /home/ondra/ext/sage/local/bin/sage-sage: line 149: 18575 Alarm clock
>> "$SAGE_ROOT/local/bin/"sage-location
>
> I just grepped through my tree and there is no "Alarm clock" string
> anywhere. I also checked the binary you tried and there is no string
> in there either. The log in SAGE_LOCAL/bin doesn't indicate any
> commits that I don't have and the repo is clean, i.e. no outstanding
> changes.

T'ain't in the Sage source; it's in libc (it's been there since I was
a lad :-}). It means that someone called 'alarm(3)', and it sprung.
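
To see this behaviour in isolation, here is a small sketch, assuming a Unix-like system and Python 3.8 or later (for `signal.strsignal`). A child process arms an alarm and installs no handler, so the default action for SIGALRM terminates it; the parent then looks up the C library's description of the signal, which is where a shell gets the text it prints. The helper name `run_child_with_alarm` is made up for the example.

```python
import signal
import subprocess
import sys

# Child program: arm a 1-second alarm, then sleep well past it.
# No SIGALRM handler is installed, so the default action (terminate) applies.
CHILD = "import signal, time; signal.alarm(1); time.sleep(30)"


def run_child_with_alarm():
    """Run the child and return its exit status as seen by the parent."""
    proc = subprocess.run([sys.executable, "-c", CHILD])
    return proc.returncode


if __name__ == "__main__":
    rc = run_child_with_alarm()
    # subprocess reports death-by-signal as the negated signal number.
    if rc == -signal.SIGALRM:
        # strsignal() asks the C library for its description of the signal;
        # the exact wording is platform-dependent.
        print("child killed by SIGALRM:", signal.strsignal(signal.SIGALRM))
    else:
        print("child exited with status", rc)
```

The description string comes from the platform's libc rather than from Python or Sage, which fits with grepping the Sage tree finding nothing.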

Justin

--
Justin C. Walker, Curmudgeon-at-Large
() The ASCII Ribbon Campaign
/\ Help Cure HTML Email

Ondrej Certik

Feb 26, 2008, 5:56:43 AM
to sage-s...@googlegroups.com
> > This is *stupid*. It's doing a process that really really better not
> > get interrupted, and it's
> > interrupting itself if it takes more than 6 minutes. I can't imagine
> > what idiot wrote such
> > bad code....
>
> Come again: On Ondrej's box it takes more than *6 minutes* to run that
> code? Something is seriously FUBAR on his box then. He did mention it a
> while back in IRC but I cannot believe it is still a problem and he
> didn't fix it :)

Hey Michael - let's make a bet. I'll create an ssh account for you on
my box (well maintained Debian amd64 unstable running on Intel Quad
Core, it for example compiles the latest kernel in 10 minutes). You'll
investigate. If it's a stupidly misconfigured Debian on my side,
I'll write on my blog that I am lame and I still have a lot of things
to learn from Michael. However, if it turns out it's a bug in Sage,
you'll write on your blog that you are lame. :) Which is a very
fair offer, because my blog is synced to planet.debian.org,
while your blog is only synced to planet.sagemath.org. :)

Ondrej

Ondrej Certik

Feb 26, 2008, 8:23:20 AM
to sage-s...@googlegroups.com

Well, I think I know where the problem is. XFS sucks. Badly.

I tried it on ext3, and it takes about 15 seconds. I experimented with
different kernels, and the latest kernel that works reasonably fast with
XFS is 2.6.16 or 2.6.17, and that's because of the nobarrier option,
which is used like this:

/dev/sda3 /home xfs defaults,nobarrier 0 2

This was the default in older kernels. I have the nobarrier option on my
laptop with 2.6.24 and it is indeed faster, but not by much (about 3
minutes or so). I know I was discussing this with Michael on IRC
regarding unpacking a Debian base image when I am testing Debian
packages. It's super fast on ext3, a lot slower on XFS with nobarrier,
and extremely slow on XFS without nobarrier, which is the default on
modern kernels.
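
To check which options a mount point is actually using, one approach is to parse the mount table. The sketch below assumes the usual whitespace-separated layout shared by /etc/fstab and Linux's /proc/mounts (device, mount point, type, options, ...); the function name `mount_options` is made up, and the sample text is the fstab line quoted above.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point, or None if it is not listed.

    Expects fstab-style or /proc/mounts-style text: whitespace-separated
    fields with the mount point second and the options fourth.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        if len(fields) >= 4 and fields[1] == mount_point:
            found = fields[3].split(",")  # a later entry wins over an earlier one
    return found


if __name__ == "__main__":
    sample = "/dev/sda3 /home xfs defaults,nobarrier 0 2\n"
    opts = mount_options(sample, "/home")
    print("options for /home:", opts)
    print("nobarrier set:", opts is not None and "nobarrier" in opts)
```

On a Linux box the same function can be fed the contents of /proc/mounts to see what is in effect right now, as opposed to what fstab asks for.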

Here are some pages I did half a year ago regarding this issue:

http://wiki.debian.org/cowbuilder_benchmark

I think I'll have to blog about this to get it more publicity, because
this is a show stopper for XFS - it's 20x slower than ext3 by default.

Ondrej

Ondrej Certik

Feb 26, 2008, 8:55:42 AM
to sage-s...@googlegroups.com

Also this message gives some background about this problem:

http://lkml.org/lkml/2006/5/22/278

They say it is safer to enable barrier support.

Actually, it's not that bad for XFS (yet). This command fixes the
reported problem:

mount -o remount,rw,nobarrier /dev/sda3 /home/

Now the first "./sage" run takes 15s on that computer. Unfortunately,
on my laptop with nobarrier option, it still takes 5 minutes, don't
know why. "dstat" on my laptop shows that the write speed oscillates
between 15MB/s and 1MB/s, more the latter than the former (if it was
15MB/s all the time, all would be ok). Unfortunately, the only spare
partition on my laptop is the 2GB swap (yes I am lame indeed, I should
have created more), but I just reformatted it to ext3, tried "./sage"
on it and: 26s. Well, I am moving from xfs to ext3 on my laptop.

Anyway, I'll turn this into a blog post. It's definitely not a bug in
Sage. Whether using default settings with XFS counts as a serious
misconfiguration on my side, well, that's debatable too. :)

Ondrej

mabshoff

Feb 26, 2008, 9:44:14 AM
to sage-support


On Feb 26, 2:55 pm, "Ondrej Certik" <ond...@certik.cz> wrote:
> On Tue, Feb 26, 2008 at 2:23 PM, Ondrej Certik <ond...@certik.cz> wrote:
>
> > On Tue, Feb 26, 2008 at 11:56 AM, Ondrej Certik <ond...@certik.cz> wrote:
> > > > > This is *stupid*. It's doing a process that really really better not
> > > > > get interrupted, and it's
> > > > > interrupting itself if it takes more than 6 minutes. I can't imagine
> > > > > what idiot wrote such
> > > > > bad code....
>
> > > > Come again: On Ondrej's box it takes more than *6 minutes* to run that
> > > > code? Something is seriously FUBAR on his box then. He did mention it a
> > > > while back in IRC but I cannot believe it is still a problem and he
> > > > didn't fix it :)
>
> > > Hey Michael - let's make a bet. I'll create an ssh account for you on
> > > my box (well maintained Debian amd64 unstable running on Intel Quad
> > > Core, it for example compiles the latest kernel in 10 minutes). You'll
> > > investigate. If it's a stupidly misconfigured Debian on my side,
> > > I'll write on my blog that I am lame and I still have a lot of things
> > > to learn from Michael. However, if it turns out it's a bug in Sage,
> > > you'll write on your blog that you are lame. :) Which is a very
> > > fair offer, because my blog is synced to planet.debian.org,
> > > while your blog is only synced to planet.sagemath.org. :)
>
> > Well, I think I know where the problem is. XFS sucks. Badly.

Hi Ondrej,

didn't I allude to the possibility that it was XFS-related in your
case? And by the way: no fair that you issue a challenge while I am
asleep and then go on and solve it :).

In the end it is cool that you tracked down the problem and in the
immortal words of Stephen Colbert: I accept your apology.
Something is certainly FUBAR on XFS with your setups and/or the Debian
defaults. Either way, have fun fixing it and we can look at it (if the
problem still exists) at SD8 in two days :)

> Ondrej

Cheers,

Michael