Hello All,
Some weeks ago I mentioned that I was going to study the Nope web server and, having done that, explain its architecture and inner workings. This is just for fun and also to learn. It is written in pure C but is extremely simple and easy to follow. Its code base is small and very concise.
I would like to state that I have not written a real web server before, so I am not an expert at all. Any error or mistake in this article is mine, and I humbly accept my weaknesses. If you see any error or poorly stated fact, please don't hesitate to point it out. I would also add that I am not a C language expert; however, I do know enough to read and understand a medium-sized code base. I spend my time these days going through C and thinking about the art.
Nope web server is just basic stuff: no fancy protocols, no WebSocket, SPDY, and co. Just the HTTP thingy, with pretty much no configuration file or other setup. I really doubt anyone would ever use it in a production environment, but it is great stuff for learning.
An HTTP web server could be viewed as a machine that is into clearing and forwarding. It takes a request from a client, works on it, and sends a response back to the client (the web browser). Then it waits for the next request, just like that. I hear you say: pretty simple. Yes, you are right! In order to achieve this we need a way to talk to the operating system's network stack, and through it to the network card. This is where the socket APIs come in. We use the socket APIs to set up our server. It is nothing but template-like stuff, a mere sequence of activities: init, bind, listen, and accept. Once you have a proper socket setup, you have something to show off and play with. You have a server, a machine to serve static web pages. But there is something lacking, something we all value so much: speed. Your machine would just be doing one thing at a time, so what if the data coming from a client hangs? What happens if there is a sudden error or exception? I guess you would like to be doing other stuff while waiting for client data to come in or finish. Here comes non-blocking I/O: you make every socket call return immediately instead of waiting. Hmm, now one slow client no longer stalls the whole server. Awesome!
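
To make the init, bind, listen, accept sequence concrete, here is a minimal sketch of the setup on a POSIX system. This is not Nope's actual code; the function name make_listener is my own invention, and binding to every local address is an assumption for illustration.

```c
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from Nope): create a non-blocking TCP
 * listening socket on the given port. Returns the descriptor, or -1
 * on failure. Passing port 0 lets the kernel pick a free port. */
int make_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);            /* init */
    if (fd < 0)
        return -1;

    int yes = 1;                      /* allow quick restarts on the same port */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* assumption: all interfaces */
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||   /* bind */
        listen(fd, SOMAXCONN) < 0) {                             /* listen */
        close(fd);
        return -1;
    }

    /* Switch the descriptor to non-blocking mode. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

With the descriptor in non-blocking mode, accept() returns -1 right away (with errno set to EAGAIN or EWOULDBLOCK) when no client is waiting, instead of putting the process to sleep.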
It is not yet a done deal. What happens when many people connect to your server at the same time? I heard you say that they should queue up, as happens at Nigerian banks' cash counters. But wait, even a bank has multiple teller points. Why can't we make our server have the same? Let's go. We have two options: threads and processes. A thread is indeed an awesome abstraction that "turns" your server into a multitasking machine. However, threads are not always easy to manage because of shared state. The best bet for you is the process: it will cost you memory, but it will not make you age fast. Starting a process is just like filling in a template. As simple as ABCD! But because memory is finite, it pays to put a limit on the number of processes you start.
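
Here is a rough sketch of the "multiple teller points" idea using fork(). It is not taken from Nope. The names spawn_workers, reap_workers, idle_worker and the MAX_WORKERS cap are all hypothetical, and a real worker would run an accept loop instead of returning immediately.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 4   /* hypothetical cap, because memory is finite */

/* Placeholder worker: a real one would loop on accept() here. */
int idle_worker(int index)
{
    (void)index;
    return 0;
}

/* Fork up to MAX_WORKERS children. Each child runs worker(i) and exits
 * with its return value. Returns how many children were started. */
int spawn_workers(int wanted, int (*worker)(int), pid_t pids[])
{
    int started = 0;
    if (wanted > MAX_WORKERS)
        wanted = MAX_WORKERS;
    for (int i = 0; i < wanted; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                  /* out of resources: stop forking */
        if (pid == 0)
            _exit(worker(i));       /* child: never returns to the caller */
        pids[started++] = pid;      /* parent: remember the child */
    }
    return started;
}

/* Wait for every child; returns how many exited with status 0. */
int reap_workers(int count, const pid_t pids[])
{
    int ok = 0;
    for (int i = 0; i < count; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

Asking for ten workers with this sketch still starts only four, which is the whole point of the cap.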
Now your server is indeed hot and you can show it off. Maybe it would secure a plum job for you at #$>% online shopping company. But before calling recruiters, do you still remember that when we queue up at a teller point, we are served one person at a time? Each teller needs a way to know which customer is ready to be served. So let's add that to our machine. As usual we have different ways of doing this, and the general name for it is event readiness notification. We have epoll (or poll) and select to play with. Nope uses both, but independently of each other.
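
A small sketch of event readiness with select() follows. It only watches one descriptor, which is much simpler than what a real server does, and the helper name wait_readable is my own, not something from Nope.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel whether fd has data to read.
 * Returns 1 if readable within timeout_ms, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);            /* the set of descriptors to watch */

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* First argument is the highest watched descriptor plus one. */
    int n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

A server would put the listening socket and every client socket into the same set, then serve only the ones select() reports as ready. epoll does the same job on Linux with a different interface that copes better with large numbers of descriptors.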
Let's look at how we can serve pages. First we need a way to understand the message from the client. Nope, being a C language thingy, uses a basic state machine (the same idea that sits underneath regular expressions) to parse client data. After parsing the client data, we have to build a data structure for it so that lookup and other stuff are pretty simple. A simple C struct (record) will do here, and that is what Nope has.
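
To show what "state machine plus struct" can look like, here is a minimal sketch that parses only the first line of a request, such as "GET /index.html HTTP/1.1". The struct layout, field sizes, and function name are my assumptions for illustration; Nope's real parser and record are different and handle more.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record; field names and sizes are assumptions. */
struct request {
    char method[8];
    char uri[256];
    char version[16];
};

enum parse_state { IN_METHOD, IN_URI, IN_VERSION };

/* Parse "METHOD SP URI SP VERSION CRLF" one character at a time.
 * Returns 0 on success, -1 if the line is malformed or a field is
 * too long for its buffer. */
int parse_request_line(const char *line, struct request *req)
{
    enum parse_state state = IN_METHOD;
    size_t n = 0;                       /* characters stored in the current field */

    memset(req, 0, sizeof *req);        /* also NUL-terminates every field */
    for (const char *p = line; *p != '\0'; p++) {
        char c = *p;

        /* End of line while reading the version: we are done. */
        if (state == IN_VERSION && (c == '\r' || c == '\n'))
            return n > 0 ? 0 : -1;

        /* A space ends the current field and moves to the next state. */
        if (c == ' ') {
            if (n == 0 || state == IN_VERSION)
                return -1;
            state = (state == IN_METHOD) ? IN_URI : IN_VERSION;
            n = 0;
            continue;
        }

        /* Otherwise store the character in the field for this state. */
        char *field = req->method;
        size_t cap = sizeof req->method;
        if (state == IN_URI) {
            field = req->uri;
            cap = sizeof req->uri;
        } else if (state == IN_VERSION) {
            field = req->version;
            cap = sizeof req->version;
        }
        if (n + 1 >= cap)
            return -1;                  /* keep room for the trailing NUL */
        field[n++] = c;
    }
    return -1;                          /* input ended before the line did */
}
```

Once the struct is filled in, the rest of the server can read req.uri or req.method directly instead of scanning the raw text again.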
Now that we have the client request in, we have to do what I will call find and match. You have already set up the different things your server can do, like fetching a file, getting data from a database, and manipulating data. So which of these functions is the client interested in? You can find out by checking the request URI and matching it against your functions; if there is a match, call that function. Some pretty smart folks call this routing, or a routing table.
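
Find and match can be sketched as a small table of URI and function pairs. The handler names and the idea of returning an HTTP status code from each handler are assumptions I made for the example, not Nope's API.

```c
#include <stddef.h>
#include <string.h>

/* A handler takes the request URI and returns an HTTP status code.
 * This signature is an assumption for illustration. */
typedef int (*handler_fn)(const char *uri);

/* Hypothetical handlers; real ones would build a response. */
static int serve_home(const char *uri)  { (void)uri; return 200; }
static int serve_about(const char *uri) { (void)uri; return 200; }
static int not_found(const char *uri)   { (void)uri; return 404; }

struct route {
    const char *uri;
    handler_fn  handler;
};

/* The routing table: which function answers which URI. */
static const struct route routes[] = {
    { "/",      serve_home  },
    { "/about", serve_about },
};

/* Return the handler whose URI matches exactly, or not_found. */
handler_fn find_route(const char *uri)
{
    for (size_t i = 0; i < sizeof routes / sizeof routes[0]; i++)
        if (strcmp(routes[i].uri, uri) == 0)
            return routes[i].handler;
    return not_found;
}
```

Calling find_route("/about") hands back serve_about, and anything not in the table falls through to not_found. Exact string matching is the simplest choice; prefix or pattern matching are common alternatives.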
After calling the matched function, you are expected to furnish the client with a response. And this being an HTTP environment, you have to set up your response properly: a status line, then headers, then a blank line, then the body. It must follow these rules. Wow! You now have a great server, a machine that can do some awesome stuff. However, some optimization might be needed. If you are on a Unix platform, maybe you should use socketpair to play with files. But this means going back to threads: locking and unlocking, waiting and signaling, some funny stuff.
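
Going back to the rules a response must follow, here is a minimal sketch that formats an HTTP/1.1 response into a buffer. The function name build_response and the choice of headers are my own assumptions; a real server sends more headers than this.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a complete HTTP/1.1 response into out.
 * Returns the number of bytes written (not counting the NUL), or -1
 * if the buffer is too small. body must be a NUL-terminated string. */
int build_response(char *out, size_t out_size, int status,
                   const char *reason, const char *content_type,
                   const char *body)
{
    int n = snprintf(out, out_size,
                     "HTTP/1.1 %d %s\r\n"        /* status line */
                     "Content-Type: %s\r\n"      /* headers */
                     "Content-Length: %zu\r\n"
                     "Connection: close\r\n"
                     "\r\n"                      /* blank line ends the headers */
                     "%s",                       /* body */
                     status, reason, content_type, strlen(body), body);
    if (n < 0 || (size_t)n >= out_size)
        return -1;                               /* error or truncated output */
    return n;
}
```

The Content-Length header is what tells the browser where the body ends, so it has to match the body exactly.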
I have just walked through what Nope is and also the basic architecture used in most web servers. In my next articles I will go through the functions and explain in depth their "functions".