I'm not sure I understand what you're asking. Yes, you should call
GQCS to get work to do. From time to time, check your list of
connections and see if any of them have had no activity for "too long"
and if any have, shut them down or handle them how you wish.
A pool of threads works best.
DS
> thank you for your answer. In Linux mono-threaded webservers are quite
> efficient.
The opposite has been my experience -- they're terrible. The most
common problem is that one client creates an unusual condition and
the code to handle that condition has to fault in because it's never
been loaded, and all the other clients are stalled while the page
fault is serviced. Yuck.
> With one thread you can handle the client connections and,
> you can also have an alarm which runs every second and checks whether
> some connection has been idle for a very long time. lighttpd works
> like this (I guess nginx does more or less the same). There is no need
> to have a thread-pool. I just wondered how this could be done in
> Windows.
You can do it exactly the same way, and it will work just as
mediocrely. You will lose out on most of the benefits of completion
ports, multi-core CPUs, and hardware that can do more than one thing
at a time, but it will work fine. Windows supports timers and it also
supports timeouts on GQCS.
DS
> I wrote a mono-threaded web server in Linux and didn't have any
> problem. It is a pain to handle the alarm because you have to record
> which connections have been interrupted so you resume them afterwards.
Are you sure you didn't have any problem? Or did you just accept that
servers are bursty? Or did you accept that you need to derate the
server because if you let the CPU get maxed, the server can fall
behind and never catch up? (Or is your load just so low that it
doesn't matter?)
> I want to use GQCS under Windows, so I thought that I would have to do
> it the same way as I did it in Linux: while I am waiting for events on
> the sockets, I am blocked and then I cannot check the idle
> connections. I saw the timer functions, but it seems I overlooked that
> I can specify a callback function. So it seems I can try to do it as I
> did it in Linux :). If the timer expires, does the thread really get
> out of the GetQueuedCompletionStatus function?
You can do it that way, but I wouldn't. GQCS takes a timeout. If you
insist on a single thread, just call GQCS with a timeout and
periodically scan the list of connections for any that have timed out.
DS
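
A minimal sketch of the "wait with a timeout, then scan for idle
connections" loop described above. It is written in Python with the
portable selectors module rather than against the Win32 API, so
sel.select(timeout=...) only stands in for calling GQCS with a timeout.
IDLE_LIMIT, WAIT_TIMEOUT, drop() and run_once() are hypothetical names
and values chosen for illustration, not anything from the thread.

```python
import selectors
import socket
import time

IDLE_LIMIT = 0.2     # assumed: seconds of silence before a connection is dropped
WAIT_TIMEOUT = 0.05  # assumed: longest one wait may block, so scans run at least this often


def drop(sel, last_activity, conn):
    """Forget a connection and close its socket."""
    sel.unregister(conn)
    last_activity.pop(conn, None)
    conn.close()


def run_once(sel, last_activity):
    """One loop iteration: wait with a timeout, service events, scan for idlers.

    Returns the list of connections closed during this iteration.
    """
    closed = []
    # The wait returns early if there is work, or after WAIT_TIMEOUT if not.
    for key, _events in sel.select(timeout=WAIT_TIMEOUT):
        conn = key.fileobj
        data = conn.recv(4096)
        if data:
            last_activity[conn] = time.monotonic()  # any traffic resets the clock
        else:
            drop(sel, last_activity, conn)          # peer closed the connection
            closed.append(conn)
    # Periodic scan: shut down anything that has been silent for too long.
    cutoff = time.monotonic() - IDLE_LIMIT
    for conn in [c for c, seen in last_activity.items() if seen < cutoff]:
        drop(sel, last_activity, conn)
        closed.append(conn)
    return closed
```

A caller would register each accepted socket with the selector, record
time.monotonic() for it in last_activity, and call run_once() in a loop.
No alarm or timer callback is needed, because the wait itself returns
often enough to run the scan.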
> I didn't have any problem with my Linux web server. It had a very
> low load, but I tested with ab (Apache Benchmark) and the performance
> was more or less like nginx and lighttpd. It is on
> freshmeat: http://unix.freshmeat.net/projects/schaefchenws
Sure, repeating the same operation over and over again will work
great. Where you run into trouble is when rarely-used code gets
invoked under moderate load, such as when one connection triggers an
error. You'll also get into trouble if one connection needs to read a
file, say from a slow NFS server, and you have no way to do that read
asynchronously.
> I will try the single-threaded web server in Windows with GQCS and the
> timers. It might be that you are right. I don't know how this approach
> works in Windows, but in Linux these kinds of web servers seem to be
> quite efficient. Let's see. I have never used GQCS so I would like to
> learn :).
They're really not. They get very bursty under load, can't handle page
faults without stalling, and god forbid they ever need to read a file
from a slow disk.
DS
> And which approach would you use? A thread-pool in which one thread
> handles a connection at a time or a thread-pool in which each thread
> might handle several connections at a time? I also implemented a
> webserver using the first approach but not the second. The problem was
> that, under heavy load, it ran out of threads.
If you can run out of threads under heavy loads, then you'd be
completely and utterly screwed if you only had one thread!
I would use a thread pool where every time I needed to do something,
I'd assign a thread from the pool to do it. The design would be very
much like your single-threaded server, except that I wouldn't be
totally and utterly screwed if I hit a page fault, read from a slow
NFS server, or otherwise got ambushed.
I'm not sure how you ran out of threads. Did you let the threads block
indefinitely even when there was work to do? Because that's *not* how
you use a thread pool. (The ideal situation is to only let a thread
block when there is no work to do. You can't achieve that perfection,
but you can get reasonably close.)
DS
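
To make the thread-pool argument concrete, here is a small sketch in
which workers block only on an empty queue, and one job that stalls
(standing in for a page fault or a slow NFS read) does not hold up the
others. worker() and run_pool() are hypothetical helpers, and
time.sleep() is only a stand-in for the stall. This is an illustration
under those assumptions, not the design of any server in the thread.

```python
import queue
import threading
import time


def worker(tasks, finished, lock):
    """Take jobs until a None sentinel arrives."""
    while True:
        job = tasks.get()       # the only place a worker waits with nothing to do
        if job is None:
            break
        name, stall_seconds = job
        time.sleep(stall_seconds)  # stand-in for a page fault or slow file read
        with lock:
            finished.append(name)  # record completion order


def run_pool(jobs, n_threads=4):
    """Run (name, stall_seconds) jobs on n_threads workers; return completion order."""
    tasks = queue.Queue()
    finished = []
    lock = threading.Lock()
    threads = [threading.Thread(target=worker, args=(tasks, finished, lock))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for job in jobs:
        tasks.put(job)
    for _ in threads:
        tasks.put(None)         # one sentinel per worker
    for t in threads:
        t.join()
    return finished
```

With jobs = [("slow", 0.3), ("fast1", 0), ("fast2", 0), ("fast3", 0)],
run_pool(jobs, 4) completes the fast jobs while the slow one is still
stalled, whereas run_pool(jobs, 1) makes every fast job wait behind it.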
> If I had only one thread, I could use asynchronous I/O (aio functions)
> to avoid getting blocked. I haven't used them in my web server but I know
> these functions are out there.
Yeah, AIO functions are an alternative to threads. They do almost
exactly the same thing; they can be lighter on some platforms, but
they are less flexible.
> I ran out of threads by doing many simultaneous web requests (using
> Apache Benchmark). I used asynchronous sockets to avoid getting
> blocked while reading/writing from the sockets... but I could, as you
> said, get blocked while reading from disk. The threads got blocked
> only when there was nothing to do. When a new connection arrived, the
> listener thread put it in a queue with a condition variable in which
> there were many threads waiting for a new connection to come.
If you got blocked reading from disk, imagine how screwed you would
have been if you only had a single thread to read from the disk!
> I could have a small thread pool in which each thread handles several
> connections simultaneously, again, avoiding getting blocked while
> reading from disk.
The disk is only as fast as it is. At some point, the disk will be a
performance-limiting factor for some kinds of servers and there's
nothing you can do about it. Unless you use AIO or a similar
technique, you will need one thread for each operation you can
usefully pend to the disk. Typically, there is no point in pending
more than about four operations to the same disk device.
DS
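
One way to act on the "about four operations per disk" rule of thumb is
to give each disk its own small pool of threads, so that no more than
that many reads are ever pending at once. The sketch below assumes
Python's concurrent.futures. DiskPool and MAX_PENDING_DISK_OPS are
hypothetical names, and the peak counter exists only to make the limit
observable.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_PENDING_DISK_OPS = 4  # the rule-of-thumb figure from the post above


class DiskPool:
    """Runs blocking reads on at most `limit` threads for one disk device."""

    def __init__(self, limit=MAX_PENDING_DISK_OPS):
        self._pool = ThreadPoolExecutor(max_workers=limit)
        self._lock = threading.Lock()
        self._in_flight = 0
        self.peak = 0  # highest number of reads seen pending at once

    def _run(self, read_fn, *args):
        with self._lock:
            self._in_flight += 1
            self.peak = max(self.peak, self._in_flight)
        try:
            return read_fn(*args)  # the blocking disk read happens here
        finally:
            with self._lock:
                self._in_flight -= 1

    def submit(self, read_fn, *args):
        """Queue a read; returns a Future. Extra reads wait inside the pool."""
        return self._pool.submit(self._run, read_fn, *args)

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

The network-facing threads hand file reads to submit() and carry on
with other connections. Reads beyond the limit queue up inside the pool
instead of piling more seeks onto the same device.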
> Yes, the hard disk is the final bottleneck. Network cards are getting
> faster and faster but the hard disks are not improving that much... it
> is true that you cannot have the hard disk serve any number of
> concurrent requests you want. When I wrote my web server, I didn't
> think of the number of concurrent operations on the hard disk, as I
> just thought it was fast enough; but you are right on that, if you
> need real speed, you also have to take this into account. Maybe the
> big web sites have their files distributed across several hard disks
> to increase the concurrency?
The secret is usually to have enough RAM that you can keep the "hot
files" in memory and you can load big enough chunks of the "cold
files" that the seek time doesn't hurt you. A single, decent modern
hard drive can read 70 MB/s. So as long as you don't waste too much
time seeking, it can half fill a gigabit link even if it has to read
one byte for every byte it sends.
DS
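
The "half fill a gigabit link" figure can be checked with a few lines
of arithmetic. The sketch assumes decimal units (1 MB = 10^6 bytes, 1
Gbit = 10^9 bits) and ignores protocol overhead. link_fill_fraction()
is a hypothetical helper.

```python
def link_fill_fraction(disk_mb_per_s, link_gbit_per_s=1.0):
    """Fraction of the link a disk can fill if every byte sent is read from disk."""
    link_mb_per_s = link_gbit_per_s * 1_000_000_000 / 8 / 1_000_000  # 125 MB/s per Gbit/s
    return disk_mb_per_s / link_mb_per_s


fraction = link_fill_fraction(70)  # 70 MB/s is the figure quoted in the post
print(f"{fraction:.0%} of a gigabit link")
```

A gigabit link carries 125 MB/s, so 70 MB/s comes to 56% of it, which
is slightly more than half.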