I think that measuring the total program runtime is okay.
First of all, nobody will use a "slow" language if their only concern
is performance. There are plenty of other metrics, measurable and not,
that attract users to higher-level languages.
Secondly, if reading an input file is a problem, then there is a
problem with the input file. I believe the input files should never be
so large that a linear-time parser, however slow, fails to pass the
time limit.
However, to make the performance competition fun for everyone, even
users of really slow languages, it would be great to have an
additional per-language ranking of solutions (e.g. the top 10 Haskell
solutions).
Now, on the "TopCoder" approach. Yes, it eliminates the common I/O
overhead, but the complexity of implementing correct time measurement
and cheat protection grows considerably!
One problem I can see with a (naive implementation of an) I/O wrapper
in Haskell: a slow program will often return its output instantly as a
lazy list, thus passing the time limit, while the output
printer/verifier will have a hard time forcing that list.
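A rough Python analogy (Haskell's lazy lists behave much like
generators here) shows why timing only the producing call is
misleading: the expensive work is deferred until the result is
consumed. All names below are illustrative, not part of any real
judge.

```python
import time

def solve(n):
    """Returns instantly -- no work is done until the result is consumed."""
    return (i * i for i in range(n))  # generator: nothing computed yet

start = time.perf_counter()
out = solve(10**6)
produce_time = time.perf_counter() - start  # tiny: only built a generator

start = time.perf_counter()
total = sum(out)                     # the verifier pays the real cost here
consume_time = time.perf_counter() - start
```

A wrapper that stops the clock after `solve` returns would report
`produce_time` and happily pass this "fast" solution; the real cost
shows up as `consume_time` on the verifier's side of the fence.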
Another, security-related, problem is that under the TopCoder model
the timing decision is made by the process containing the user code.
There is thus no easy way to prevent the user from forging the timing
information, and this problem would need to be solved separately for
every supported language.
Now, if we look at the TopCoder model from another angle, programmer
convenience, it becomes a great idea! It eliminates the tedious I/O
code written for every submission, at the cost of defining a simple
input format specification once per problem (which should be useful
for the problem setter himself!) and a parser generator for each
language. If we drop the "fair time measurement with pre-parsed input"
requirement, it becomes even more convenient: we can then make the use
of generated parsers optional (like the Google AI Challenge starter
packages), and we no longer have to write a parser generator for every
language, because users will be able to write their own.
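To make the idea concrete, here is a toy sketch of what a per-problem
input specification plus a generated parser could look like. The spec
syntax (`name:int`, `name:int[count_field]`) is entirely invented for
illustration; a real specification language would need more types.

```python
def make_parser(spec):
    """Build a whitespace-token parser from a toy spec: each line is
    'name:int' for a scalar or 'name:int[count_field]' for a list
    whose length is a previously read field."""
    fields = [line.split(":") for line in spec.strip().splitlines()]

    def parse(text):
        tokens = iter(text.split())
        values = {}
        for name, typ in fields:
            if typ.endswith("]"):                       # e.g. int[n]
                count = values[typ[typ.index("[") + 1:-1]]
                values[name] = [int(next(tokens)) for _ in range(count)]
            else:                                       # plain int
                values[name] = int(next(tokens))
        return values

    return parse

# hypothetical spec: an integer n followed by n integers
parse = make_parser("n:int\nxs:int[n]")
data = parse("3 10 20 30")   # {'n': 3, 'xs': [10, 20, 30]}
```

The problem setter writes the two-line spec once; each language's
generator (or a user's hand-rolled equivalent) turns it into a parser
like `parse` above.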
Another advantage I see is that it will simplify the input verifier
greatly! It will only have to check that the numbers/sizes lie within
the proper bounds; all the format checks will be automated.
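With the format already enforced by the generated parser, the verifier
that remains is a few lines of bounds checking. A hypothetical sketch
(the helper name and bounds table are made up for illustration):

```python
def verify_bounds(values, bounds):
    """Check that each parsed field lies within its [lo, hi] bounds;
    list-valued fields are checked element-wise."""
    for name, (lo, hi) in bounds.items():
        v = values[name]
        items = v if isinstance(v, list) else [v]
        if not all(lo <= x <= hi for x in items):
            return False
    return True

# e.g. for the "n followed by n integers" problem above
ok = verify_bounds({"n": 3, "xs": [10, 20, 30]},
                   {"n": (1, 100), "xs": (0, 10**9)})
```

Everything structural (token count, list lengths, types) was already
rejected by the parser, so this is genuinely all that is left to
verify.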
Summary of what I would like to see:
* Total run time measurement with a single process-runner (as before:
robust, fair, and secure)
* Per-language ranking of solutions in addition to a cross-language one
* A simple TopCoder-style input specification for problems (every
problem, or should we allow exceptions?)
* An optional input parser for every language, generated from the
input specification for the given problem