Re: [AOLSERVER] aolserver vs lighthttpd, benchmarks

Tom Jackson

unread,

Oct 1, 2007, 12:40:19 PM10/1/07

to AOLS...@listserv.aol.com

Gustaf,

I'm going to ask that the patch be removed and replaced with a module. I've
already written one which does the same thing.

We are in a bad habit in this community of letting anything happen at the
least possible cost to those who modify core code. This is a perfect example
of the situation. We now have two new subcommands for ns_conn. Some would
phrase it as only two. But, for instance, most commands have like four total
subcommands.

Regardless of the number, we have just added two undocumented commands to the
core code. We have a major problem with documentation here, and we know why:
developers are not required (not even encouraged!) to do any work documenting
new code or features. They don't have to provide any examples which work with
AOLserver. All that they have to do is to commit a patch. This easy five
minute job now requires everyone else to do work.

Second, the patch does nothing for AOLserver without at least figuring out
where to get the ::thread package. Again, there is no documentation on where
to get it, how to install it or use it. But the example code which was
referenced below is part of a much larger project: OpenACS, and a very recent
version of it. In fact, it is part of OpenACS that isn't even necessary for
the core features, and is only there for optimization as you have pointed
out. The examples are written in xotcl, which I have tried to figure out for
a year or more with little success.

Third, some technical issues:

New features which don't require core changes should be done as modules,
unless there is an obvious close relationship to the core functions. In this
case, the new feature is so foreign to the normal workings of AOLserver that
the code should be isolated from the core until it is fully understood. Since
you didn't write the original code, I assume that you can't really explain
all the details of what is going on and why. I spent a lot of time looking at
this new feature and seeing if there was anything similar in AOLserver. What
I discovered was quite the opposite. There is not a single example where two
conns (conns, not sockets) are returning data to clients in the same thread.
It is possible that the AOL team once considered this, but none of the core
API support this in any way. Each is written to work in a single thread (or
block) and contains an internal event loop (in C). It is impossible to
imagine that the core could be rearranged to support a generic multiplexing
of client returns. When you survey how many different types of returns there
are, you will immediately understand why. The easiest way to put it is that
reading client data is essentially a binary read until passed to a connection
thread. Once the connection thread has all the binary data, it can do
conversions as needed.

AOLserver request model is a pipeline which is specialized per request. We can
consider this new feature as creating a branch where the original pipeline
continues (like after ns_conn close), and another branch sends data to the
client. But AOLserver has no control over the new branch. It could fail and
we would never know. There are no logs of failures. Actually, not quite true,
it looks like if the delivery fails for some reason, or if you don't call
[ns_conn contentsentlength $filelength], you get zero bytes in the
access.log.

On the surface, the patch appears (as described) to do something like
ns_returnfile. All the patch does is to make a copy of the connection and
wrap the sock in a Tcl_Channel. It doesn't create a worker thread, doesn't
pass the conn to this thread, it doesn't write headers or open the static
file. There is no way to signal for shutdown and to a thread join. All of
this work has to be done by the developer, including debugging, testing, and
everything else. So what we have is a new way for everyone to start doing
things in different ways, wondering why it doesn't work, complaining about
why it doesn't work the way they expect, etc. Some will probably offer their
own patches to make it work better.

When we accept the patch, we allow you to transfer all this mess to the
AOLserver community. Now we have to answer all these concerns, you don't.

This also short circuits discussion about the best way to do this in
AOLserver, and what to recommend to new users. Why? We just had another good
discussion about the performance of AOLserver for static content. We
discovered that the model for AOLserver works very well in this context, so
much so that we are kind of scrathing our heads over any need to think too
hard about performance improvements. There really are better things to work
on. There seems to be an obsession among new developers that performance
tweaking is the first thing to be considered when writing code. They think
about a few lines of code, one syscall vs another. But when you read the
AOLserver sources, it really starts to dawn on you: it ain't easy.
Fortunately it is already written, so why not use it to best advantage? Part
of that being done for you is the module system. If new functionality can be
added without patching the core it does two important things: it provides
examples of how to do it, and proves (as in tests) the sound design of the
core. If developers are encouraged to provide patches instead of modules it
works against these goals.

Anyway, I fully support the code in module form. I don't completely understand
the performance or security impact, but as a module, users can take it or not
by their own choice. If I had to do a similar task, I would probably create a
unique code and create a symlink to the actual file, then redirect the user
to that file using another instance of AOLserver. The other instance would
check the age of the symlink and serve the file if under some configured
amount. If the entire file gets served, remove the link. This way, you can
fully leverage the REST style, and it will likely release your expensive
connection thread even faster than the current example.

Using the code as a module (minus the ability to transfer the channel), here
is an example of use (the string bgwrite could easily be changed to anything
else, I chose it before I realized that the code doesn't do any bgwriting.):

# Get nContentSent (should be 0)
set contentsentlength [ns_bgwrite contentsentlength]
# Set nContentSent (to zero)
ns_bgwrite contentsentlength $contentsentlength

# Dup sock and wrap in Tcl_Channel
set channel [ns_bgwrite channel]

# File to return:
set dir [file dirname [info script]]
set image sns-thumb.jpg
set file [file join $dir $image]

# Get length of file:
set contentLength [file size $file]

# Get content type:
set contentType [ns_guesstype $file]

# Create last modified header data:
set lastModified [ns_httptime [file mtime $file]]

# Create header:
set header "HTTP/1.1 200 OK
Last-Modified: $lastModified
MIME-Version: 1.0
Date: [ns_httptime [ns_time]]
Server: AOLserver/4.5.0
Content-Type: $contentType
Content-Length: $contentLength
Connection: close
"

# Open file to send:
set inFd [open $file r]

# Configure channels:
fconfigure $inFd -translation binary
fconfigure $channel -translation binary

# put headers (could block forever):
puts $channel $header

# fcopy file to channel.
fcopy $inFd $channel -command [list close $channel]

#^^in a conn thread, this only sends the first 4096 bytes
# while in background mode (caused by -command switch)

# The following still happens regardless of fcopy status:

# Set nContentSent so log file will look normal:
ns_bgwrite contentsentlength $contentLength

# Log something
ns_log Notice "File $file has contentlength = $contentLength"

# No need to return, as [ns_bgwrite channel] has effect of [ns_conn close].

# Traces will still run on this thread
# what happens to keepalive?

### Log File Messages:
# error.log
[-default:0-] Notice: File /web/nsd45/servers/jnm/pages/sns-thumb.jpg has
contentlength = 28672
[-default:0-] Notice: running ::traceFilter, sleeping for '1' seconds

# access.log
192.168.1.102 - - [01/Oct/2007:06:58:23 -0700] "GET /nsbgwrite.tcl HTTP/1.0"
200 28672 "" ""

# access.log without setting nContentSent
192.168.1.102 - - [30/Sep/2007:22:00:25 -0700] "GET /nsbgwrite.tcl HTTP/1.0"
200 0 "" ""

tom jackson

On Saturday 29 September 2007 13:55, Gustaf Neumann wrote:
> Tom Jackson schrieb:
> > Okay, after looking further into this patch, I see that it doesn't
> > actually add any functionality to AOLserver. It looks like you would have
> > to install a newer version of OpenACS to use this.
>
> as i wrote in my earlier mail, the patch is simple and small and adds
> just two
> subcommands to ns_conn. Thanks to dossy, the patch is commited to cvs.
> head.
>
> The applications to "ns_conn channel" are on the tcl layer and
> are quite simple to use. Look into the xotcl-core package
> (xotcl-core/tcl/bgdelivery-procs.tcl) in the openacs cvs repository.
> With the patch, xotcl-core and libthread, one can replace e.g.
> ns_returnfile 200 $mime_type $filename
> by
> ad_returnfile_background 200 $mime_type $filename
> in cr_write_content in acs-content-repository/tcl/revision-procs.tcl
> to use the background delivery for content sent from the
> content repository (e.g. file store).

--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to <list...@listserv.aol.com> with the
body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: field of your email blank.

Dossy Shiobara

unread,

Oct 1, 2007, 1:11:28 PM10/1/07

to AOLS...@listserv.aol.com

On 2007.10.01, Tom Jackson <t...@RMADILO.COM> wrote:
> On the surface, the patch appears (as described) to do something like
> ns_returnfile. All the patch does is to make a copy of the connection
> and wrap the sock in a Tcl_Channel.

My only real concern about the change:

What happens if you try to [ns_chan create] on the Tcl channel returned
from [ns_conn channel]? Does the actual behavior line up with what one
might normally expect (i.e., Policy of Least Surprise)? Is the behavior
well-defined?

-- Dossy

--
Dossy Shiobara | do...@panoptic.com | http://dossy.org/
Panoptic Computer Network | http://panoptic.com/
"He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on." (p. 70)

Andrew Piskorski

unread,

Oct 1, 2007, 1:48:35 PM10/1/07

to AOLS...@listserv.aol.com

On Mon, Oct 01, 2007 at 09:40:19AM -0700, Tom Jackson wrote:
> Gustaf,
>
> I'm going to ask that the patch be removed and replaced with a module.

Why? What possible harm does including Gustaf's patch in the stock
server cause, rather than loading it as an additional module?

> Regardless of the number, we have just added two undocumented
> commands to the core code.

So what?

> We have a major problem with documentation here, and we know why:
> developers are not required (not even encouraged!) to do any work documenting
> new code or features.

Uh, so you'd rather have NEITHER useful new features nor
documentation? Because in general, I think that's what you're likely
to get.

And besides, Gustaf has written about his patch at length multiple
times over the last few years. The content is out there - in what way
is it not adequate? And just where is Gustaf SUPPOSED to add docs for
his new feature anyway?

> Second, the patch does nothing for AOLserver without at least
> figuring out where to get the ::thread package.

So what? Again, where's the harm?

And AOLserver should probably be changed somtime to always use and
ship with the Tcl Thread package anyway.

> So what we have is a new way for everyone to start doing
> things in different ways,

Everyone? No one is required to use Gustaf's feature, and indeed, the
only way to use it is by explicitly writing Tcl code to do so, right?

> wondering why it doesn't work, complaining about why it doesn't work
> the way they expect, etc. Some will probably offer their own patches
> to make it work better.

God forbid that anyone would offer potential improvements... Oh yeah,
I'm really worried about that!

> This also short circuits discussion about the best way to do this in
> AOLserver, and what to recommend to new users.

Tom, Gustaf has been both discussing and heavily using this one patch
for several years. Just how much more discussion do you want? It
obviously has been working well for U. Wien and others for years now,
so why not just adopt their proven patch as is, rather than screwing
around turning it into a loadable module?

What's the actual problem with Gustaf's code? You've obviously read
and thought about it, Tom (which I have not), but so far I see a lot
of theoretical hand wavy complaints from you, but little solid
criticism of the actual code.

--
Andrew Piskorski <a...@piskorski.com>
http://www.piskorski.com/

Tom Jackson

unread,

Oct 1, 2007, 2:00:52 PM10/1/07

to AOLS...@listserv.aol.com

Well there are some surprising things in this change. I'm not sure what the
spliceout variable means, but it is unrelated to the Ns_ and Tcl_ spliceout
functions, and I have no idea what they do yet.

But the nContentSent (and sock) member is private to a Conn, not even visible
in a Ns_Conn, so I was worried this could be abused somehow. But I also
didn't realize that nContentSent is set in the connection thread, not in the
worker thread, unless I'm wrong about this, which would mean that the worker
thread is directly setting nContentSent in another thread. I can't test this,
but you don't pass in a $conn to the call, so there can't be more than one
hanging around.

With a module it is also easier to hack away without needing to re-make
install nsd.

I just posted the module code so it is easier to examine in context:
http://rmadilo.com/files/nsbgwrite/

Note that there are some differences. Certain information is available via the
public API, and I have replaced that code with the public API call. I also
replaced 'int sock' with 'SOCKET sock' to reflect other uses.

One warning in the patch and module is in:
Conn *connPtr = (Conn *) conn;

The same code is used in a few other places in tclresp.c (I think), with the
same warning.

tom jackson

On Monday 01 October 2007 10:11, Dossy Shiobara wrote:
> On 2007.10.01, Tom Jackson <t...@RMADILO.COM> wrote:
> > On the surface, the patch appears (as described) to do something like
> > ns_returnfile. All the patch does is to make a copy of the connection
> > and wrap the sock in a Tcl_Channel.
>
> My only real concern about the change:
>
> What happens if you try to [ns_chan create] on the Tcl channel returned
> from [ns_conn channel]? Does the actual behavior line up with what one
> might normally expect (i.e., Policy of Least Surprise)? Is the behavior
> well-defined?

--

Dossy Shiobara

unread,

Oct 1, 2007, 2:34:26 PM10/1/07

to AOLS...@listserv.aol.com

On 2007.10.01, Andrew Piskorski <a...@PISKORSKI.COM> wrote:
> And just where is Gustaf SUPPOSED to add docs for
> his new feature anyway?

I will try to update doc/ns_conn.n soon and include the two new ns_conn
subcommands that Gustaf's patch adds.

> And AOLserver should probably be changed somtime to always use and
> ship with the Tcl Thread package anyway.

I'm not sure I agree with this. Not right now, anyway. But, this is a
conversation that I want to defer until I've had sufficient time to
think about it carefully.

-- Dossy

--
Dossy Shiobara | do...@panoptic.com | http://dossy.org/
Panoptic Computer Network | http://panoptic.com/
"He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on." (p. 70)

Jeff Rogers

unread,

Oct 1, 2007, 2:55:44 PM10/1/07

to AOLS...@listserv.aol.com

Andrew Piskorski wrote:

> What's the actual problem with Gustaf's code? You've obviously read
> and thought about it, Tom (which I have not), but so far I see a lot
> of theoretical hand wavy complaints from you, but little solid
> criticism of the actual code.

I also think the code belongs on a module rather than in the core.

My reason is because the code is the wrong solution. I'm not saying the
code is bad or that it doesn't work, because it obviously works and
solves a problem for alot of people. The problem is that with this
code, fixing a problem (simple network i/o blocking heavyweight
connection threads) requires doing something different at the script
level, meaning it is likely that some pages will use the fix and some
will not. This is a hack allowing a limitation to be worked around, not
a fix for the limitation. When the patch fixes the core so that all
connections (or selectively as specified in the config file) receive the
benefit, then it belongs in the core.

I see this as similar to the inclusion of gzip compression. Several
versions ago a module was contributed that added a new command
nz_zreturn to allow for gzip-compressed output. This was wrong because
someone would need to explicitly choose to use it on every individual
page. Including gzip compression in the core is the right way, and
that's where it is now. ns_return_background is fine to work around the
network writing problem for now, but it should be only a module; the
core patch should be a complete fix, one that applies the solution to
every connection using plain old ns_return.

Yes, we may need to wait a little while for this "real" fix. And when
we do create it, this patch becomes unnecessary except as backward
compatibility, which again will belong in a module, for those who need it.

-J

Tom Jackson

unread,

Oct 1, 2007, 3:04:06 PM10/1/07

to AOLS...@listserv.aol.com

On Monday 01 October 2007 10:48, Andrew Piskorski wrote:
> Tom, Gustaf has been both discussing and heavily using this one patch
> for several years. Just how much more discussion do you want? It
> obviously has been working well for U. Wien and others for years now,
> so why not just adopt their proven patch as is, rather than screwing
> around turning it into a loadable module?

Patching core code with experimental features is screwing around. There is
nothing wrong with screwing around with your own copy, but accepting this
type of code into core would lower the bar for any change. This code has
nothing to do with the normal workings of AOLserver. I just spent a few days
looking for any other example of this type of thing. Instead I found some
nice comments like this:

tclsock.c-766- /*
tclsock.c-767- * Pass a dup of the socket to the callback thread, allowing
tclsock.c-768- * this thread's cleanup to close the current socket. It's
tclsock.c-769- * not possible to simply register the channel again with
tclsock.c-770- * a NULL interp because the Tcl channel code is not entirely
tclsock.c:771: * thread safe.
tclsock.c-772- */
tclsock.c-773-
tclsock.c-774- sock = ns_sockdup(sock);

...

tclsock.c-1077- if (cbPtr->chan == NULL) {
tclsock.c-1078- /*
tclsock.c-1079- * Create and register the channel on first use. Because
tclsock.c:1080: * the Tcl channel code is not entirely thread safe, it's
tclsock.c-1081- * not possible for the scheduling thread to create and
tclsock.c-1082- * register the channel.
tclsock.c-1083- */
tclsock.c-1084-
tclsock.c-1085- cbPtr->chan = Tcl_MakeTcpClientChannel((ClientData) sock);

> What's the actual problem with Gustaf's code? You've obviously read
> and thought about it, Tom (which I have not), but so far I see a lot
> of theoretical hand wavy complaints from you, but little solid
> criticism of the actual code.

Both of these are used in the patch, and that doesn't mean there is a problem,
but these dup/copy are happening once per receiving thread. What happens when
multiple channels get pushed into a single thread? You also have to fake up
the nContentSent so that the access.log is 'maybe' correct.

Anyway, my criticism is limited to the fact that you don't need to patch the
core to add this feature, and the change does not at all match up with
anything else in AOLserver core. Is it okay to add anything? How about remove
anything? If Gustaf can add commands, why can't I remove them? Isn't this the
definition of anarchy? There is no logical basis for making a decision, just
the ability to do so. The module concept completely eliminates this problem
by dividing up responsibilities and giving it to those who are interested in
it. The idea that modules have owners (usually folks negotiate with the
maintainer) but the core is a free pass is pretty strange.

The mere fact that the code has been in use for several years does not mean
that this is the way it should go in the public community code.

tom jackson

Dossy Shiobara

unread,

Oct 1, 2007, 3:35:56 PM10/1/07

to AOLS...@listserv.aol.com

On 2007.10.01, Jeff Rogers <dv...@DIPHI.COM> wrote:
> I also think the code belongs on a module rather than in the core.

My vote is to leave [ns_conn channel] in the core officially, and mark
[ns_conn contentsentlength] as deprecated to warn folks from coupling
their code too tightly to it, as it may be removed in a future version.

While I think I agree with you (if I understand you correctly), that
introducing [ns_conn channel] as a solution the problem it's currently
being used to solve is not the best end solution, I do think having the
capability was something lacking in the ns_conn Tcl command for a long
time. The ability to interact with the socket connection as a plain Tcl
channel is something that's very useful (think: "Comet" style HTTP
server push implementations, etc.). However, I do agree that solving
the problem of freeing up connection threads while large HTTP responses
are being returned to the client should be handled deeper within the
server itself, not through developer-supplied application code.

-- Dossy

--
Dossy Shiobara | do...@panoptic.com | http://dossy.org/
Panoptic Computer Network | http://panoptic.com/
"He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on." (p. 70)

Vasiljevic Zoran

unread,

Oct 1, 2007, 3:43:03 PM10/1/07

to AOLS...@listserv.aol.com

This is all old stuff as since some time the Tcl channel
handling code is thread-safe and you can create channels
in one, de-register them and register them again in the
other thread.

Beside that, the patch that Gustaf added is neutral in the sense
what usage people will make out of it. It merely allows you to
re-use the connection socket from your Tcl code w/o the need to
add yet-another ns_return-like of command. You can use this for a
variety of purposes, not necessarily only for implementing a
single-thread-static-content server.
It is safe to use with [ns_channel] machinery as well...

Cheers,
Zoran

Tom Jackson

unread,

Oct 1, 2007, 5:42:22 PM10/1/07

to AOLS...@listserv.aol.com

On Monday 01 October 2007 12:35, Dossy Shiobara wrote:
> My vote is to leave [ns_conn channel] in the core officially, and mark
> [ns_conn contentsentlength] as deprecated to warn folks from coupling
> their code too tightly to it, as it may be removed in a future version.

Personally I don't think it has officially become part of core, neither API
has ever been released, unless you consider a cvs commit as released code,
AOLserver has never had these API, why not move both of them to a module
_before_ we have to warn anyone.

Please note that I have never even suggested to not accept the code, just how,
and I have done _all the work_ of writing a module. If the bar for accepting
code is simply that, the module version should qualify.

tom jackson

Rick Cobb

unread,

Oct 1, 2007, 8:55:44 PM10/1/07

to AOLS...@listserv.aol.com

Hmm. As the people who originated that kind of push, we do something a
little more straightforward -- but we've never integrated it to TCL
programming.

You may recall the "persistent connection" patch that was never applied
to AOLServer 3.4. In our case, what we asked to do was to remove a
connection from management by the regular connection pool(s), so we
could pass it to our own worker threads for long-term handling. Each of
our worker threads will typically handle thousands of these.

I generally like the ability to reassign connections; I'd have to defer
to Vlad Hociota to explain what core changes we needed to make to get
our stuff running on AOLServer 4.5; as I recall, the changes in the
driver model made our impact quite different.

-- ReC

Malte Sussdorff

unread,

Oct 1, 2007, 11:21:37 PM10/1/07

to AOLS...@listserv.aol.com

From an OpenACS users point of view who is also working closely
with ]po[, probably the major reason for AOLserver installations
nowadays, I just want this to work and easy to install. I assume the
module will also work on the Windows Version? I furthermore assume
the module does not have an performance impact in comparison to the
extension of ns_conn? Last but not least, what code do we have to
rewrite or is the module providing the two extensions to ns_conn?

I am just asking because in the end it might be just easier for us to
continue patching the AOLserver as we do anyway at the moment (http://
www.cognovis.de/developer/en/aolserver_install).

Malte

Gustaf Neumann

unread,

Oct 2, 2007, 5:50:14 AM10/2/07

to AOLS...@listserv.aol.com

Hi everybody,

Tom Jackson schrieb:

> Gustaf,
>
> I'm going to ask that the patch be removed and replaced with a module.

i got the - wrong - impression that you (Tom) revised your proposal to
change
the patch into a module, when you realized, that the patch is NOT
implementing background delivery, but something much more simple
and generic. In the xotcl-core OpenACS module, "ns_conn channel"
is as well used for purposes, quite different from background delivery,
namely for providing a COMET-style back-channel from the server to
the client
http://www.ajaxian.com/archives/comet-a-new-approach-to-ajax-applications

For example, the xotcl-core module uses this channel for implementing
a streaming chat (clients subscribe via HTTP to a named channel,
which is disconnected from the connection thread and transfered
to a spooling thread; messages sent to the channel are broadcasted
to all subscribers of the channels; the subscribers keep the channels
open and read asynchronously from the browser either via ajax or iframe).
This is not something we "could do in some future versions",
but something we have working since one and a half years.
http://openacs.org/forums/message-view?message_id=423582

The reason why i have described this is to show that the ability to
obtain from a connection thread a Tcl channel opens a much larger
range of applications than just the background delivery. The
application range is just limited by the creativity of a developer.

The ns_conn subcommands extend the connection "object"
ns_conn to provide a method to obtain the channel the Tcl
channel, and to query or set the sent-content-length (setting
is in my background delivery implementation is somewhat
over-optimistic, since errors during file delivery are not
reflected in access log (only in the error log). But only those
errors are missed in the access log, which are happening
between start of the asynchronous transfer and the close.
The most typical "error" is, when a longer transfer is interrupted
(e.g. user closes browser during transfer)). A maybe better
solution would be to extend the nslog module
to deal with asynchronous deliveries, such it is able
to receive a call, when the full content is delivered, and
to prevent logging in the trace immediately after a request
in a connection thread).

In general, i would prefer to see on the longer range
some convergence between the aolserver way and
"ordinary" tcl. Obtaining the tcl channel is a small
step this way.

Additionally, i would prefer aolserver and naviserver
to be as compatible as possible (all parties will profit
from this). As mentioned earlier, both ns_conn subcommands
are available already in naviserver.

My change sits only in the head version of the cvs tree.
Most probably, nobody is running a production environment
from head. So the potential harm is very limited. The
question is rather, whether or not this functionality should
be included in the next release of aolserver, and when
and how the next release will be made available. I was
quite surprised to see that the aolserver has 10 admins
and 58 developers registered at sourceforge (considered
the low number of changes over the last year). My feeling is
that whoever is and feels responsible for next release
should decide what patches should be included
in the next release or not. From my point of view, dossy
is the premier candidate.

> Regardless of the number, we have just added two undocumented commands to the
> core code.

I was not sure, whether the panoptic wiki intends to document
4.5 or cvs head. Most probably the intention is to document 4.5.
I mentioned this in my mail to dossy immediately after the
commit and wrote the commit message in a style suitable for
documentation (see below). Your rhetoric is somewhat unfair
(my be not willingly unfair due to limited information).

========================================== BEGIN COMMIT MSG
Extend ns_conn by two commands:

ns_conn channel

Return a Tcl channel from the current connection.
This channel can be used to talk via Tcl I/O to
the remote client afterwards.

ns_conn contentsentlength ?number?

Query or set the number of octets sent to the client. This command
can be used in connection with "ns_conn channel" to adjust the length
entry in a log file in cases, the file was sent via Tcl I/O.

"ns_conn channel" was originally written by Zoran.

========================================== END COMMIT MSG

> When we accept the patch, we allow you to transfer all this mess to the
> AOLserver community. Now we have to answer all these concerns, you don't.

Come on. The new subcommands empower the developer and
add missing functionality enabling the development of different
kinds of applications the tcl way. Don't patronize the developers by saying
"you might end up with non working code". By the same argument,
many commands the tcl api must be forbitten/removed/whatever.

I would accept arguments like "do it the aolserver way"
instead of the "tcl way", to
- provide ns_* means to detach a stream from a connection thread and
- transfer it to a spooling thread and
- reimplement there something like an ns_fcopy working on streams
without making the visible to the user. But why?

The committed changes aka "this mess" is working since a few
years reliably in production. i just looked it up: yesterday
background delivery was used more than 200.000 times on a
single server here. Are you trying to argue that the change is
reducing the reliability of the aolserver/openacs?
Talk to to nima, who was close to giving up on openacs due to
performance problems, especially on days when many file downloads
happened.

I should also mention that the code in the aolserver module works
together with the request-monitor modules of openacs, where you
see the ongoing foreground and background activities.

And how does "this mess" change, if there is a module just implementing
the two subcommands? you just end up with more maintenance,
more installation effort (updating aolserver startup scripts, updating
build processes, updating FAQs). Maybe it helps the ego of some
developer if one can say "use my module", something, i don't care
about.

> Using the code as a module (minus the ability to transfer the channel), here
> is an example of use (the string bgwrite could easily be changed to anything
> else, I chose it before I realized that the code doesn't do any bgwriting.):
>

How about trying to understand what the code does before making a module?

-gustaf neumann

Tom Jackson

unread,

Oct 2, 2007, 1:51:13 PM10/2/07

to AOLS...@listserv.aol.com

On Tuesday 02 October 2007 02:50, Gustaf Neumann wrote:
> > I'm going to ask that the patch be removed and replaced with a module.
>
> i got the - wrong - impression that you (Tom) revised your proposal to
> change
> the patch into a module, when you realized, that the patch is NOT
> implementing background delivery, but something much more simple
> and generic.

I wonder how many times I have to repeat myself, but I'll try again:

This 'simple' and 'generic' feature has nothing to do with ns_conn or how
connections are processed in AOLserver. The patch and the module do the same
thing, so what the code does is not of concern to me. It is an interesting
feature. I wish there were examples of how to use it with plain AOLserver
plus thread::transfer.

What we are really discussing is your resistence to putting the feature in the
correct place. And the reason for your resistence appears to be that it is
too much work for you. Even with the module completely written and
functional, that isn't enough. You want a special arrangement which saves you
from having to undo the bad decision to patch AOLserver for two years,
instead of writing a module.

Maybe we should ask why it took two years to submit such a simple patch?

I'm not interested in fame and glory or in saying 'use my module'. I'm trying
to provide an example of how to contribute to the AOLserver community in a
way which respects the extremely well developed API and the community.
Imagine what the sources would look like if the goal was to just do it as
easily as possible. There would be no ability to extend the server. Future
maintainers would have to carefully consider every change as it might have
difficult to detect consequences.

There are lots of things which are inconvenient in programming. But we do them
because in the long run they save us time and effort. The AOLserver sources
provide a good example of how inconvenient good programming can be.

Of all the talk about design patterns, I haven't ever heard about convenience
being one of these patterns. Why should convenience be the criteria for
adding new features to AOLserver?

> Additionally, i would prefer aolserver and naviserver
> to be as compatible as possible

I prefer to advocate for AOLserver, not naviserver, OpenACS or Tcl. And in
this case it would be nice if new code followed AOLserver coding norms. Maybe
you can get naviserver to take out their code. The module I wrote at least
compiled against their server, but nsd doesn't run on my 64bit laptop:

web/navi $ ./bin/nsd -f -t sample-config.tcl
[02/Oct/2007:10:37:45][6858.690331232][-main-] Notice: nsmain: Tcl version:
8.4.14
[02/Oct/2007:10:37:45][6858.690331232][-main-] Fatal: NsTclInitObjs:
sizeof(int) < sizeof(long)
Aborted

tom jackson

Vasiljevic Zoran

unread,

Oct 2, 2007, 3:41:50 PM10/2/07

to AOLS...@listserv.aol.com

On 02.10.2007, at 19:51, Tom Jackson wrote:

> web/navi $ ./bin/nsd -f -t sample-config.tcl
> [02/Oct/2007:10:37:45][6858.690331232][-main-] Notice: nsmain: Tcl
> version:
> 8.4.14
> [02/Oct/2007:10:37:45][6858.690331232][-main-] Fatal: NsTclInitObjs:
> sizeof(int) < sizeof(long)
> Aborted

This has been corrected since some time already.
CVS contains the correct code. The last released
code 4.99.2 is from 2006-02-04. Please update
your CVS sandbox and recompile.
We wanted to release 5.0 since some time but the
CVS head is so stable that we do not have any
pressing need so far.

Cheers,
Zoran

Stephen Deasey

unread,

Oct 2, 2007, 5:07:16 PM10/2/07

to AOLS...@listserv.aol.com

On 10/2/07, Tom Jackson <t...@rmadilo.com> wrote:

> ... it would be nice if new code followed AOLserver coding norms. Maybe

> you can get naviserver to take out their code. The module I wrote at least

> compiled against their server...

You're using symbols declared in nsd/nsd.h, which is private to the
server and not installed by default. Things in here change regularly.

You're asking for trouble...

> ...but nsd doesn't run on my 64bit laptop:

>
> [02/Oct/2007:10:37:45][6858.690331232][-main-] Fatal: NsTclInitObjs: sizeof(int) < sizeof(long)
> Aborted

That's *really* old code you're running...

...which is our fault for not making a release. Sorry.

You'll have more joy running cvs head and reporting bugs on the mailing list:

http://naviserver.sourceforge.net/w/Mailing_Lists

Tom Jackson

unread,

Oct 2, 2007, 7:42:35 PM10/2/07

to AOLS...@listserv.aol.com

Stephen,

Thanks for pointing that out. I was wondering about that.

I have posted new module code which avoids the private symbols.

But this removes the contentsentlength option.

I still wonder what the difference between dup'ing and not dup'ing is for
(spliceout vs. non-spliceout). In the original code, the not dup'ing was hard
coded in as the only option. The current code hard codes this to dup.

Anyway, I don't really know if this will work as the original, I was able to
fcopy in background a short text file, but larger than 4096 bytes just gets
that amount according to wget.

In foreground/blocking mode, fcopy returns larger files.

I've included my test script as well:

http://rmadilo.com/files/nsbgwrite/

So all this does now is to return the sock wrapped in a Tcl_Channel.

[ns_conn close] also has the effect of closing the channel. Maybe if it is
moved to another thread that will not happen?

tom jackson

Stephen Deasey

unread,

Oct 2, 2007, 9:23:17 PM10/2/07

to AOLS...@listserv.aol.com

On 10/3/07, Tom Jackson <t...@rmadilo.com> wrote:

> Anyway, I don't really know if this will work as the original, I was able to
> fcopy in background a short text file, but larger than 4096 bytes just gets
> that amount according to wget.
>
> In foreground/blocking mode, fcopy returns larger files.
>
> I've included my test script as well:
>

> http://rmadilo.com/files/nsbgwrite/nsbgwrite.tcl

By using the -command option to fcopy you've enabled background mode,
but I don't see a call to vwait. You're not entering the event loop.

4096 happens to be the default buffer size of a Tcl channel.

Tom Jackson

unread,

Oct 2, 2007, 10:58:15 PM10/2/07

to AOLS...@listserv.aol.com

Stephen,

Thanks, that worked. The -command was executed logging 'okay' + number of
bytes written.

But the ns_log statements following vwait are not executed. I assume that is
expected? I posted the updated script at:

http://rmadilo.com/files/nsbgwrite/nsbgwrite.tcl

tom jackson

Stephen Deasey

unread,

Oct 3, 2007, 9:27:09 AM10/3/07

to AOLS...@listserv.aol.com

On 10/3/07, Tom Jackson <t...@rmadilo.com> wrote:

> Thanks, that worked. The -command was executed logging 'okay' + number of
> bytes written.
>
> But the ns_log statements following vwait are not executed. I assume that is
> expected? I posted the updated script at:
>
> http://rmadilo.com/files/nsbgwrite/nsbgwrite.tcl

fcopy $inFd $channel -command [list ns_log Notice "okay"]
vwait $channel

vwait is expecting a variable *name*. Here, it's waiting for the
variable 'sock23' (or whatever) to be set, which never happens.

There's a short example on the man page:

http://www.tcl.tk/man/tcl8.4/TclCmd/vwait.htm

Tom Jackson

unread,

Oct 8, 2007, 2:07:29 PM10/8/07

to AOLS...@listserv.aol.com

I have a patch for conn.c and ns.h which enables the functionality of the
previous conn.c patch to be handled in a module, and/or written more clearly.
The patch is an extension of the external Ns_Conn* API, best seen in context
using the patch of ns.h:

Index: ns.h
===================================================================
RCS file: /cvsroot/aolserver/aolserver/include/ns.h,v
retrieving revision 1.86
diff -u -r1.86 ns.h
--- ns.h 7 Jul 2006 03:27:22 -0000 1.86
+++ ns.h 8 Oct 2007 14:11:36 -0000
@@ -659,6 +659,7 @@
NS_EXTERN char *Ns_ConnServer(Ns_Conn *conn);
NS_EXTERN int Ns_ConnResponseStatus(Ns_Conn *conn);
NS_EXTERN int Ns_ConnContentSent(Ns_Conn *conn);
+NS_EXTERN int Ns_ConnSetContentSent(Ns_Conn *conn, int nContentSent);
NS_EXTERN int Ns_ConnResponseLength(Ns_Conn *conn);
NS_EXTERN Ns_Time *Ns_ConnStartTime(Ns_Conn *conn);
NS_EXTERN char *Ns_ConnPeer(Ns_Conn *conn);
@@ -666,7 +667,10 @@
NS_EXTERN char *Ns_ConnLocation(Ns_Conn *conn);
NS_EXTERN char *Ns_ConnHost(Ns_Conn *conn);
NS_EXTERN int Ns_ConnPort(Ns_Conn *conn);
-NS_EXTERN int Ns_ConnSock(Ns_Conn *conn);
+NS_EXTERN SOCKET Ns_ConnSock(Ns_Conn *conn);
+NS_EXTERN int NS_ConnSetSock(Ns_Conn *conn, SOCKET sock);
+NS_EXTERN int Ns_ConnSetSockInvalid(Ns_Conn *conn);
NS_EXTERN char *Ns_ConnDriverName(Ns_Conn *conn);
NS_EXTERN void *Ns_ConnDriverContext(Ns_Conn *conn);
NS_EXTERN int Ns_ConnGetKeepAliveFlag(Ns_Conn *conn);

I also updated the module to use the new API so that it functions the same as
ns_conn channel and ns_conn contentsentlength.

Everything is available at http://rmadilo.com/files/nsbgwrite/

There is also an associated Tcl module which creates the worker thread using
thread::create, storing the id in an nsv_array, along with a counter to track
the number of uses. An example tcl page takes requests and passes them off to
the worker thread. The tcl page works with both [ns_conn channel] and
[ns_bgwrite channel].

I used apache-bench to saturate the driver, threadpool and worker queues,
using the default threadpool with a maxthreads of 10.

The summary of requests/sec for concurrencies from 1-100 (28k image):

Requests per second: 163.53 [#/sec] (mean) -n 1 -c 1
Requests per second: 187.54 [#/sec] (mean)
Requests per second: 217.56 [#/sec] (mean)
Requests per second: 309.01 [#/sec] (mean)
Requests per second: 318.85 [#/sec] (mean)
Requests per second: 300.49 [#/sec] (mean)
Requests per second: 284.14 [#/sec] (mean)
Requests per second: 409.18 [#/sec] (mean)
Requests per second: 484.74 [#/sec] (mean)
Requests per second: 400.49 [#/sec] (mean)
Requests per second: 421.17 [#/sec] (mean)
Requests per second: 386.16 [#/sec] (mean)
Requests per second: 387.18 [#/sec] (mean)
Requests per second: 540.77 [#/sec] (mean)
Requests per second: 509.06 [#/sec] (mean)
Requests per second: 497.58 [#/sec] (mean)
Requests per second: 390.64 [#/sec] (mean)
Requests per second: 407.27 [#/sec] (mean)
Requests per second: 398.91 [#/sec] (mean)
Requests per second: 725.58 [#/sec] (mean)
Requests per second: 765.01 [#/sec] (mean) -n 2000 -c 20
Requests per second: 761.57 [#/sec] (mean) -n 2000 -c 40
Requests per second: 685.17 [#/sec] (mean)
Requests per second: 724.04 [#/sec] (mean) -n 2000 -c 75
Requests per second: 588.27 [#/sec] (mean)
Requests per second: 766.57 [#/sec] (mean) -n 4000 -c 100

Full results and summaries are at:
http://rmadilo.com/files/nsbgwrite/logs/

Everything seems to work pretty well. Although the response time goes up with
concurrency, the timing in the server log (log.txt) shows that this is mostly
due to time waiting in a queue, not in processing time, as both time in the
tcl page and the worker thread are about equal or less than the average
response time.

One issue which seems strange is that with concurrency > 1, there is an
additional request showing up in the logs (actually (concurrency -1)
additional requests). These end in error, the sock is not available. Apache
doesn't show that it sends these and then disconnects, but they show up even
at prequeue in the driver thread. It may be a bug in apache bench. Whatever
the cause, it doesn't appear to be anything more than the client
disconnecting at some intermediate step. The exact same behavior is seen with
[ns_conn channel] using the cvs head version. Or I have a bug in my tcl code.

New APIs:

Ns_ConnSetSock can be used for two purposes. One is to change the sock used by
a connection. The other is to signal that the sock is no longer valid, so the
connection pipeline will do something different. (Not sure exactly what is
different, but at least the Conn doesn't try to close the sock, since it
doesn't know about it anymore. But Ns_Sock still exists.)

Ns_ConnSetSockInvalid is an easier to understand version of Ns_ConnSetSock, it
simply calls Ns_ConnSetSock(conn, INVALID_SOCKET).

The other new API is Ns_ConnSetContentSent. I'm not too sure about this API,
but it does allow you to log the expected number of bytes sent to the client
when you handle the return outside of a conn. There are no good other options
at the moment. the access log module looks pretty unfriendly to additional
inputs.

ns_conn:

Instead of being such an ass on this, I'll remove my objection to the new
ns_conn channel, etc. Tcl API, but I think both should be deprecated for
future use, replaced with one or more modules which use the new Ns_Conn APIs.
The main reason being that ns_conn channel does way too much to be in conn.c.
The real missing feature was the ability to invalidate the sock (or make
Ns_Sock think the socket was invalid) so that it wouldn't be cleaned up
immediately (and restore the conn sock if something goes wrong while creating
a channel).

async without Thread package?:

After looking at too much code, task.c and tclhttp.c (which gives ns_http)
caught my eye. It turns out that tclhttp.c is a very complicated example of
how to use task.c (Ns_Task). But the purpose of ns_http is very different
than handing off client returns to a single worker thread. In fact, ns_http
does a lot of _extra_ work to ensure that each http request is handled by a
single (not the queue) thread. But multiple threads could all work at the
same time using the same Ns_Task queue.

So, the bottom like appears to be that Ns_Task could be used to create a
worker thread. Adding a task to a queue triggers the event loop if it isn't
already running. It should be very easy to write the callback for the queue
to initialize by transfering the sock, if it hasn't already been transfered,
and opening the file to send, creating the headers, etc.

The task event loop works by doing one callback operation for each sock that
is ready, like copy the read buffer to the write buffer. The fact that the
callback proc doesn't have to do anything but very simple operations makes it
easier to program, but this was far from apparent with the ns_http example,
which uses blocking wait for reading. Ns_Task breaks up the 'what to do' from
the 'where to do it' and 'when to do it'.

tom jackson

Reply all

Reply to author

Forward