A security researcher decided to purposely take down sage.math to
demonstrate that it is possible to fork bomb the machine through the
public sage notebook servers. I had always planned to run these completely
public servers until something like this happened. Therefore,
sagenb.org (and the other public sage notebook servers I host) will be
completely disabled until further notice.
I might re-enable them in the future if I set them up from scratch
using a vmware virtual machine and vmware server. Given that I've never
successfully configured vmware server on any Linux box, I don't know
when this will happen. If a Sage developer would like to attempt to do
this instead of me on sage.math please contact me, since this is not
currently my highest priority (especially because I'm in France
traveling right now).
-- William
kcrisman,
This was discussed recently. Several people said that if you start
several Sage notebooks on the same machine or virtual machine, but
different ports, things can scale up. It's having too many people on
the same sage notebook that seems to be the problem. We aren't sure
what the bottleneck is; someone needs to do some profiling to find out
where it is.
Some people suggested something like 30 people per Sage notebook, but
you can experiment and find out how many people you can have log in and
actively use a single Sage notebook.
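As a rough illustration of the "several notebooks on different ports" idea, here is a minimal Python sketch that works out how many notebook instances a given number of users would need. The figure of 30 users per notebook is only the number suggested above, and the base port, the function names, and the round-robin assignment are assumptions made for this example, not anything the notebook does by itself.

```python
import math


def plan_notebooks(n_users, users_per_notebook=30, base_port=8000):
    """Return one port per notebook instance needed for n_users.

    users_per_notebook=30 is the figure suggested in the thread;
    base_port=8000 is an assumption for this sketch.
    """
    if n_users < 0 or users_per_notebook < 1:
        raise ValueError("need n_users >= 0 and users_per_notebook >= 1")
    count = math.ceil(n_users / users_per_notebook)
    return [base_port + i for i in range(count)]


def port_for_user(user_index, ports):
    """Spread users over the instances round-robin (hypothetical policy)."""
    return ports[user_index % len(ports)]
```

For a class of 100 students this gives four instances, on ports 8000 through 8003 under the assumed base port.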
How much memory do you allocate to the virtual server? Are you sure
that all the memory is being used up?
Also, it seemed that at least one person was running Knoboo and it
seemed to scale fine, though that means you would give up interacts.
Thanks,
Jason
Yes, this is a short-term final decision. I always planned to run the
public servers as is until some script kiddie $%%#$^% etc. I assume
Harald Schilly will fix the website.
Note that as I say above, this "final decision" really is "short term",
i.e., probably 2-3 weeks. vmware is perfect for this application.
William
Serge
That is not enough. Could you allocate, say... 4GB instead?
> Yes, swap space is completely full
> and main memory is very close to full. The machine does not halt -
> you can usually still log in. His point of view is that the current
> issue is not networking-related, but rather that the various notebooks
> opening up are creating so many processes that the system is
> overtaxed.
He is probably right.
>
> What I am wondering is if anyone knows how quickly a) multiple logins
> to a notebook might do that or b) interact processes might do that (do
> the objects get cached, for instance?) or c) people forgetting to log
> out and perhaps leaving a notebook running might do that
c) could easily. Did you set the timeout parameter for the server?
    timeout -- (default: 0) seconds until idle worksheet sessions
               automatically timeout, i.e., the corresponding
               Sage session terminates. 0 means 'never timeout'.
Also, you can limit memory usage for individual projects.
> or d)
> something else I can't think of might do that. Sysadmin knows about
> VMs but not so much about internals of Sage, so he isn't sure if it's
> simply people logging in and then not logging out while the notebook
> is still active, or if it could be something else.
>
> I understand that to some extent there is a lot of uncertainty as to
> how efficiently the notebook works, but I know so little about how it
> works (and about interact) that I'm asking the stupid questions, in
> case one of them turns out to have part of the answer.
They are not dumb, and it is very interesting thinking about a
concrete example of the notebook being used in a constrained environment.
William
--
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org
Okay, great.
>
>> How much memory do you allocate to the virtual server? Are you sure
>> that all the memory is being used up?
>>
>
> My sysadmin says probably 512 MB. Yes, swap space is completely full
> and main memory is very close to full. The machine does not halt -
> you can usually still log in. His point of view is that the current
> issue is not networking-related, but rather that the various notebooks
> opening up are creating so many processes that the system is
> overtaxed.
>
> What I am wondering is if anyone knows how quickly a) multiple logins
> to a notebook might do that or b) interact processes might do that (do
> the objects get cached, for instance?) or c) people forgetting to log
> out and perhaps leaving a notebook running might do that or d)
> something else I can't think of might do that. Sysadmin knows about
> VMs but not so much about internals of Sage, so he isn't sure if it's
> simply people logging in and then not logging out while the notebook
> is still active, or if it could be something else.
There is a timeout parameter for the notebook function that is supposed
to automatically kill idle processes, I believe. That might alleviate
problems stemming from (c).
Jason
Browse around in
http://sage.math.washington.edu/home/was/sagenb/
and grab the relevant worksheet.txt file.
E.g., this directory has some of your worksheets:
http://sage.math.washington.edu/home/was/sagenb/nb2/sage_notebook/worksheets/pong/
You can then paste the worksheet.txt content into the notebook
in edit mode.
I wonder if we should have something like sagenb.org,
but for which people have to request an account and provide
credentials, and agree not to purposely attack the system? I.e.,
like we have with the trac system? E.g., I would be happy
to give pong an account on such a public server, but I would
not give one to the person who crashed sage.math.
William
I think that would be a good idea. However, I think one of the greatest
contributions of the free server was that it gave people with no
incentive to install Sage or start using it (and no incentive to email
someone and provide credentials) a chance to use Sage. I think the
try-before-you-buy (or at least, before you invest time) is a very big
part of our marketing strategy, and really lets people know that we are
completely free and we're not kidding about being helpful and free.
I think not having a free server available for anyone to try is going to
hurt us (I think quite a bit) in the long run.
Here's one way to do things to satisfy this crowd, though. In another
project, the OpenSourceCMS, there are installations of a huge number of
web CMS systems (content management systems, like Drupal, Moodle,
Wordpress, etc.). They refresh their install (i.e., completely reset the
install) every 2 hours (I believe) and put a timer on their page that
shows a countdown until the two hours is up. Then all systems are wiped
clean and reset and the timer starts again. They give out admin and
user-level passwords to each CMS so people can try them out. Here's the
example page for Drupal:
http://www.opensourcecms.com/index.php?option=com_content&task=view&id=132&Itemid=1&catid=68
I used opensourcecms.com quite a bit when I was evaluating CMS systems
because it was a no-hassle, anonymous way of playing with systems. It
didn't bother me that it was reset every 2 hours; in fact, I liked it
because I didn't have to worry about someone else messing up the system
(I just waited a few hours until the next refresh). Can we do the same
thing for Sage? Run it in a virtual server (even the current VMWare
one would do), and every day, or every few hours, reset it. Curious
people can play around with Sage, and those that want more than a
temporary playground can then install it on their own or request an
account on William's server he mentions above.
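The countdown timer described above could be computed along these lines. This is only a sketch: it assumes resets happen at fixed multiples of the period counted from the Unix epoch (so, for a two-hour period, on the even hours UTC), and the function name and the alignment rule are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Two hours, as on the opensourcecms.com demo described above.
RESET_PERIOD = timedelta(hours=2)


def seconds_until_reset(now, period=RESET_PERIOD):
    """Seconds left until the next reset.

    Assumes resets are aligned to multiples of `period` since the
    Unix epoch; exactly at a reset time, a full period is returned.
    """
    period_s = int(period.total_seconds())
    elapsed = int(now.timestamp()) % period_s
    return period_s - elapsed


if __name__ == "__main__":
    remaining = seconds_until_reset(datetime.now(timezone.utc))
    print("Next reset in %d min %d s" % divmod(remaining, 60))
```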
Thanks,
Jason
Thanks for providing these!
Would it be easy to start the sage server again, with accounts disabled
(or passwords all changed to nonsense), so that people could access the
published worksheets? Before the initial server reset, there was a huge
library of rather nice worksheets. I can see people being depressed
that all of that work is gone in an instant (at least, people that
aren't subscribed here to know where you put the raw notebook files, or
people that are subscribed here that aren't comfortable mucking around
with the internal notebook directory structure).
Just yesterday at a seminar talk, I had someone ask me about the huge
number of published worksheets in the previous Sage notebook. That's
quite a resource that lots of people invested lots of time into that is
no longer easily available, but could be.
It would take probably 15 minutes to whip up a short Python for loop
that iterates through everyone's account and resets their password to
gobbledygook. Then just put the server online (disabling creation of
accounts) so that published worksheets are available. If William is
okay with this, I can download the notebook in the next few days and
reset passwords so that it could be served in this read-only fashion.
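A minimal sketch of that loop, in plain Python. It assumes the accounts are available as a simple dict mapping username to password, which is a stand-in for illustration only; the real notebook stores its users differently, and the function name is hypothetical.

```python
import secrets


def scramble_passwords(accounts):
    """Return a new dict with every password replaced by random text.

    `accounts` is a hypothetical {username: password} mapping. The new
    passwords are never shown to anyone, so no account can log in.
    """
    return {username: secrets.token_urlsafe(32) for username in accounts}
```

The original mapping is left untouched, so the old credentials could be restored later if the server goes back to normal operation.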
Jason
It seems this would limit the problem but on the other hand this problem could
also have been triggered without 'security research' intent. Someone writes a
recursive program using fork() in e.g. Cython and fails to terminate the
recursion properly. All I'm saying is that this wouldn't ensure nothing bad
can happen.
Cheers,
Martin
--
name: Martin Albrecht
_pgp: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
_www: http://www.informatik.uni-bremen.de/~malb
_jab: martinr...@jabber.ccc.de
I like this idea, it also emphasises that this is a server to evaluate Sage
not a replacement of an installation of Sage for doing research and stuff.
>
> On Wednesday 15 October 2008, Jason Grout wrote:
>> I used opensourcecms.com quite a bit when I was evaluating CMS
>> systems
>> because it was a no-hassle, anonymous way of playing with
>> systems. It
>> didn't bother me that it was reset every 2 hours; in fact, I liked it
>> because I didn't have to worry about someone else messing up the
>> system
>> (I just waited a few hours until the next refresh). Can we do the
>> same
>> thing for Sage? Run it in a virtual server (even the current VMWare
>> one would do), and every day, or every few hours, reset it. Curious
>> people can play around with Sage, and those that want more than a
>> temporary playground can then install it on their own or request an
>> account on William's server he mentions above.
>
> I like this idea, it also emphasises that this is a server to
> evaluate Sage
> not a replacement of an installation of Sage for doing research and
> stuff.
I think this is a good idea too. Being able to say "just go to this
site and try it out" is very good for getting people to try it out
and play around with it, and the load will probably go down overall
if "serious" users are forced to actually get a copy.
I'd imagine with appropriate ulimits and multiple virtual servers,
something could be set up such that anything that accidentally (or
maliciously) happens on one server would only kill that one, and only
until it is freshly reset (say, after a given number of hours).
- Robert
With virtual machines, this is easy: the server runs as a guest virtual
machine with a network port on the host forwarded to the guest. Every
two hours (or whatever), the host runs `kill -9' on the guest -- which
is about the same as pulling the plug -- and restarts the guest from a
snapshot. I'm reasonably certain this could be done from a cron job with
VirtualBox (which is what I use); I'm guessing the other virtualization
setups (KVM, Xen, VMWare) can do it too.
Dan
--
--- Dan Drake <dr...@mathsci.kaist.ac.kr>
----- KAIST Department of Mathematical Sciences
------- http://mathsci.kaist.ac.kr/~drake
> On Oct 16, 1:55 am, Dan Drake <dr...@mathsci.kaist.ac.kr> wrote:
>> runs `kill -9' on the guest -- which
>> is about the same as pulling the plug -- ...
>
> Uhm, since virtual box has a python scripting interface, a more humane
> reset to snapshot functionality could be possible :)
With a fallback to kill -9 just in case...
> But how are the
> accounts managed? If they are just deleted everytime and there is no
> datastore across all virtual box instances, there isn't much fun. Is
> it possible to mount a shared homedirectory with multiple virtual box
> instances? At least, mounting one external share r/w works ...
Yes, but of course this opens up vulnerabilities. Actually, if it's
being reset, is there a need to even create an account?
- Robert
Or you can use (with virtualbox):
VBoxManage snapshot discardcurrent -state
from the commandline (I think I have the parameters right).
And then fall back to kill -9 and running the command again, of course.
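The reset-from-snapshot step could be scripted roughly as follows and called from a cron job. This is a sketch under assumptions: the VM name is hypothetical, and the commands use the restorecurrent form found in more recent VirtualBox releases rather than the discardcurrent form quoted above, so check the syntax against the installed version. The kill -9 fallback is not shown, since it needs the process ID of the VM.

```python
import subprocess


def reset_commands(vm_name):
    """Commands to power off the guest, restore its current snapshot,
    and boot it again without a GUI window."""
    return [
        ["VBoxManage", "controlvm", vm_name, "poweroff"],
        ["VBoxManage", "snapshot", vm_name, "restorecurrent"],
        ["VBoxManage", "startvm", vm_name, "--type", "headless"],
    ]


def reset_guest(vm_name, run=subprocess.run):
    """Run the reset sequence and return the commands that failed.

    A failed poweroff is expected when the guest is already down, so
    the sequence continues past failures and leaves it to the caller
    to decide what to do (for instance, fall back to kill -9).
    """
    failed = []
    for cmd in reset_commands(vm_name):
        result = run(cmd)
        if result.returncode != 0:
            failed.append(cmd)
    return failed
```

A cron entry would then call something like reset_guest("sage-demo") every two hours, where "sage-demo" is a made-up VM name.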
>> But how are the
>> accounts managed? If they are just deleted everytime and there is no
>> datastore across all virtual box instances, there isn't much fun. Is
>> it possible to mount a shared homedirectory with multiple virtual box
>> instances? At least, mounting one external share r/w works ...
>
> Yes, but of course this opens up vulnerabilities. Actually, if it's
> being reset, is there a need to even create an account?
>
Creating accounts makes a barrier to entry, but it also makes it so that
people don't mess with other people's worksheets (i.e., if I'm playing
with Sage, someone else won't come in and start changing my worksheet
around under me). So I vote for accounts.
Jason
> Creating accounts makes a barrier to entry, but it also makes it so
> that
> people don't mess with other people's worksheets (i.e., if I'm playing
> with Sage, someone else won't come in and start changing my worksheet
> around under me). So I vote for accounts.
Good point. Maybe it takes an account to publish a worksheet, but if
I just come to the site I can start trying it out right away with a
blank worksheet and a randomly-generated one-time account (i.e., it
would never take you to the "worksheets list").
- Robert
Good point as well. Yes, I think that would work out really nicely.
I click a link on the sage website, and am immediately presented with a
worksheet that I can start typing stuff into.
Jason