On 12/17/2013 11:23 PM, Brian Hauer wrote:
> As for code size and complexity, we do not have a GitHub issue for
> that but we have been making some headway on that internally. We are
> considering measuring things like source lines of code (sloc)
Would this depend on, say, a regular expression for each language
identifying where comments appear, to avoid counting them?
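To make the question concrete, here is a minimal sketch of what such a regex-based counter might look like. It is only my guess at an approach, not anything TechEmpower has described; the pattern table and the `count_sloc` name are hypothetical. Plain regexes like these will miscount when a comment marker appears inside a string literal, or when block comments nest, so a real tool would probably need something sturdier.

```python
import re

# Hypothetical per-language comment patterns, for illustration only.
COMMENT_PATTERNS = {
    # "#" to end of line
    "python": re.compile(r"#.*$", re.MULTILINE),
    # "//" to end of line, or a "/* ... */" block (possibly multi-line)
    "c": re.compile(r"//.*?$|/\*.*?\*/", re.MULTILINE | re.DOTALL),
    # ML-family "(* ... *)" block; nesting is NOT handled
    "ml": re.compile(r"\(\*.*?\*\)", re.DOTALL),
}


def count_sloc(source: str, language: str) -> int:
    """Count lines that still hold non-whitespace text once comments are removed."""
    stripped = COMMENT_PATTERNS[language].sub("", source)
    return sum(1 for line in stripped.splitlines() if line.strip())
```

Blank lines and comment-only lines drop out of the count; a line with code followed by a trailing comment still counts once.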
> As you can imagine, a full application test type is considerably more
> difficult to specify and implement. I have been hesitant with the
> idea in the past because of the amount of labor involved. However, I
> could change my opinion there, especially considering how generous the
> community has been with providing test implementations.
It might be useful to change the whole perspective for this project, to
the point where you expect the community of each framework to provide a
good implementation of each benchmark, rather than thinking of a small
set of benchmark developers doing everything. That seems to be how the
venerable Great Computer Language Shootout worked, for instance.
The people contributing code have natural incentives to work hard to
maximize quality! (And with the high profile of the TechEmpower
benchmarks now, I think framework developers will see the value of
working on benchmarks, where they may not have when you began.)
One more pair of suggestions I wanted to add, to reduce friction for
some frameworks in processing requests:
1. The current specification requires inclusion of "Server" and "Date"
headers in responses. Is this necessary? It requires a bit of
marginally ugly code in implementations for frameworks that run
bare-bones HTTP servers that don't habitually include such headers. We
would typically throw a proxy in front of such a server in a realistic
deployment, but I think it's useful to benchmark the underlying server
directly in this sort of context.
2. The current specification requires, for some benchmarks, processing
of "an integer query string parameter named queries." However, the
benchmark_config parameters like "query_url" seem to indicate that the
infrastructure will be flexible about formatting; perhaps no code
changes would be necessary to support arbitrary URL formats that just
end in the numeric "queries" parameter? The specification also requires
applying a default interpretation in case of a missing or malformed (not
an integer) parameter, but it's not clear if benchmarking actually uses
that flexibility. Frameworks that adopt a higher-level view of web apps
(like Ur/Web) may require more work than others to accommodate such a
fixed, ad-hoc URL format; Ur/Web would be happier with URIs like
"/queries/20", and no explicit request processing code would be
required, in contrast to the current benchmark implementation.
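As a rough illustration of the second suggestion, the sketch below accepts either the query-string form or a URI that simply ends in the number, and falls back to a default when the value is missing or malformed. The `parse_queries` name is hypothetical, and the default of 1 and the cap of 500 are assumptions on my part rather than values quoted above.

```python
from urllib.parse import parse_qs, urlsplit

DEFAULT_QUERIES = 1   # assumption: fallback for a missing or malformed value
MAX_QUERIES = 500     # assumption: upper bound on the accepted value


def parse_queries(url: str) -> int:
    """Extract the query count from "?queries=N" or from a trailing "/N" segment."""
    parts = urlsplit(url)
    raw = parse_qs(parts.query).get("queries", [None])[0]
    if raw is None:
        # No usable query-string value: try the last path segment, e.g. /queries/20
        raw = parts.path.rsplit("/", 1)[-1]
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return DEFAULT_QUERIES
    # Clamp into the assumed valid range.
    return max(DEFAULT_QUERIES, min(value, MAX_QUERIES))
```

With this, "/db?queries=20" and "/queries/20" are treated alike, which is the kind of relaxation I have in mind.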
So, would it actually be easy to relax both of these requirements?
Perhaps it wouldn't even break backwards compatibility, since it would
only relax the problem specification?
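For reference, the extra code the first suggestion refers to looks roughly like the following in a bare-bones server. This is a generic sketch, not taken from any particular framework; `with_required_headers`, the `server_name` default, and the `timestamp` parameter are hypothetical names introduced for illustration.

```python
from email.utils import formatdate
from typing import Dict, Optional


def with_required_headers(
    headers: Dict[str, str],
    server_name: str = "example-server",  # hypothetical placeholder value
    timestamp: Optional[float] = None,    # None means "use the current time"
) -> Dict[str, str]:
    """Return a copy of headers with "Server" and "Date" filled in if absent."""
    result = dict(headers)
    result.setdefault("Server", server_name)
    # usegmt=True yields the HTTP date form, e.g. "Thu, 01 Jan 1970 00:00:00 GMT"
    result.setdefault("Date", formatdate(timestamp, usegmt=True))
    return result
```

It is not much code, but it has to run on every response, and a proxy in front of the server would normally take care of it.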
A last question: would it be substantially easier to implement new
features if participants who are interested in them combined to pay a
modest amount of money for the work to be done? This could prompt
allegations of biasing results in favor of the frameworks whose
developers contribute financially, but I figured I'd put it on the table.