Would an Intel machine with lots of RAM, running Linux/Apache,
be enough for this task? Or should I go for a workstation?
/mikael
--
My first instinct would be to use 2 or 3 cheaper Linux boxes with less RAM
and DNS load balancing.
Mirror the data from all the websites across all the machines. Hard
drives are cheap enough that you shouldn't bother with NFS. You gain the
benefit of better access speed since the data is local and you don't end
up with a situation where one machine crashing can take down your whole
system. It'll also reduce your local network traffic. When you need more
capacity, you just plug another box into your network, run mirror, and
pretty soon you have X percent more capacity with minimal hassle. You may
still want one special machine to hold all pages that update databases,
and one (probably the same one) for CGI, but mirror this machine's data
to the whole cluster as well.
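As a rough sketch of the "plug in a box and run mirror" step: the
Python below only builds the copy commands, one per mirror host, and
prints them. The mirroring tool is not named above, so rsync is used
here purely as a stand-in; the docroot path and the hostnames are
made up for illustration.

```python
import shlex

# Hypothetical values, not from the post.
MASTER_DOCROOT = "/var/www/"   # trailing slash: copy the directory's contents
MIRROR_HOSTS = ["web1.example.com", "web2.example.com", "web3.example.com"]


def mirror_commands(docroot, hosts):
    """Return one rsync command (as an argv list) per mirror host."""
    return [
        # -a preserves permissions and times; --delete removes files on the
        # mirror that no longer exist on the master.
        ["rsync", "-a", "--delete", docroot, f"{host}:{docroot}"]
        for host in hosts
    ]


if __name__ == "__main__":
    for argv in mirror_commands(MASTER_DOCROOT, MIRROR_HOSTS):
        # Print instead of executing, so the sketch is safe to run anywhere.
        print(shlex.join(argv))
```

Adding capacity then amounts to appending one hostname to the list
and letting the next mirror run populate the new box.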
IMO, the big advantage to redundant arrays of inexpensive computers is
that if one goes down, it's bad, but not catastrophic, especially if all
the machines mirror all the data - even if your main CGI machine craps
out, you just tinker with your primary DNS server, restart named and
you're ready to rock and roll again, if a little slower.
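To make the "a little slower" part concrete, here is a toy simulation
of DNS rotation in Python. It assumes the name server hands out
addresses in strict rotation and that nothing is cached, which real
resolvers do not guarantee, so treat the even split as an idealized
case. The addresses come from the 192.0.2.0/24 documentation range
and are placeholders.

```python
from itertools import cycle, islice


def round_robin(addresses, n_queries):
    """Hand out addresses in strict rotation, one per query."""
    return list(islice(cycle(addresses), n_queries))


def share_per_host(addresses, n_queries):
    """Count how many of n_queries land on each address."""
    counts = {addr: 0 for addr in addresses}
    for addr in round_robin(addresses, n_queries):
        counts[addr] += 1
    return counts


if __name__ == "__main__":
    pool = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]   # placeholder addresses
    print(share_per_host(pool, 600))        # 200 queries per box

    # One box dies: drop its address record and reload the name server.
    survivors = [a for a in pool if a != "192.0.2.2"]
    print(share_per_host(survivors, 600))   # 300 queries per remaining box
```

With three boxes, losing one pushes each survivor from a third of the
load to half of it: slower, but still serving.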
Parts for PCs also run much cheaper in my (very limited) experience.
The number of customers has nothing to do with the size of the machine;
it's the number of hits that counts. I have seen some results which
seem to indicate that a P5-100 running Slackware (1.2.13 kernel) and
NCSA httpd 1.5a will support 5-7k hits/hr (roughly 1.4-2/sec) without
undue strain. Either NCSA or Apache works well (I had problems compiling
Apache but didn't chase them), and a faster machine obviously could
support more. The box I mention has 32 MB of RAM; you can add more,
increase the prefork limit, put the content on multiple drives, etc.
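For a back-of-the-envelope check of those figures, the Python below
converts hits per hour to hits per second and estimates how many such
boxes a given load would need. The 20,000 hits/hr target is a made-up
number, and the estimate assumes load divides evenly and capacity
scales linearly with box count, which is optimistic.

```python
import math


def hits_per_second(hits_per_hour):
    """Convert an hourly hit count to an average per-second rate."""
    return hits_per_hour / 3600


def boxes_needed(target_hits_per_hour, per_box_hits_per_hour):
    """Boxes required if each one sustains per_box_hits_per_hour."""
    return math.ceil(target_hits_per_hour / per_box_hits_per_hour)


if __name__ == "__main__":
    for hourly in (5000, 7000):
        print(f"{hourly} hits/hr = {hits_per_second(hourly):.2f} hits/sec")

    # Hypothetical target load, sized against the conservative 5k figure.
    print(boxes_needed(20000, 5000))
```

Sizing against the low end of the range leaves some headroom for
CGI-heavy sites, which cost more per hit than static pages.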
I think you will find that disk rather than CPU is the common limit,
and more memory and drives help that. I hope to try the md driver in
a web application, but probably not until May or so.
You should really look at projected use, not customers. Some will
generate few hits, while others will bog you right down. Look at CGI
use, and charge accordingly. Disk is cheap, bandwidth expensive. CPU
is relatively cheap, maybe a P5-166 with burst cache (~$1200) to
start. You can probably get a dual CPU board for less than double
that, and the 2xP6-100 board was only about $5k. By the time you need
it you can afford it.
Note: Triton chipset boards are fast and cheap, but don't support
parity. This is a server, and this is your business. SiS chipset boards
have parity and are within a few % as fast. Make your own decision,
but be aware there is a decision to be made. Prices above are for
parity boards, and the single processor board has EIDE and 2x16550+P
(two 16550 serial ports plus a parallel port) onboard.
Good luck with your business.
--
-bill davidsen (davi...@tmr.com)
"As a software development model, Anarchy does not scale well."
-Dave Welch