Language Shootout on 28 Dec 2007

Uffe Seerup

Feb 27, 2008, 6:39:46 AM
to tirania.org blog comments.
The C# version in the Language Shootout uses "Compiled" - i.e. it is
instructed to compile the regular expression. Now, if Mono is anything
like MS .NET, compiled should be taken literally - the class library
actually produces IL code which is subsequently JITed. This is a
radically different meaning of "compiled" from the one used by the other
languages, where the term simply means that an internal structure is
prepared for processing rather than the raw text version of the regex.

The truly compiled version of course has much better performance, but
it takes longer to compile. A developer will only use the compiled
option for long-lived regexes. Otherwise he will use the non-compiled
option - which corresponds to compiled in the other languages.

The shootout uses the regex once and throws it away. Hence, C# takes a
hit for an actual compilation. The implementation should be changed to
use non-compiled regular expressions.
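
To make the two options concrete, here is a minimal sketch of the same
pattern built both ways. The class name, helper name, pattern and input
are made up for illustration and are not taken from the shootout
source; only Regex, RegexOptions.None and RegexOptions.Compiled are
real class library names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexOptionsSketch
{
    // Hypothetical helper: build a regex with the given options and
    // count its matches in the input.
    public static int CountMatches(string input, string pattern, RegexOptions options)
    {
        // RegexOptions.None: the pattern is parsed into an internal
        // program that an interpreter executes.
        // RegexOptions.Compiled: on MS .NET the pattern is additionally
        // turned into IL, which is then JITed.
        Regex regex = new Regex(pattern, options);
        return regex.Matches(input).Count;
    }
}
```

Both calls below should return the same count; only the construction
cost and the matching speed differ:

    RegexOptionsSketch.CountMatches(text, "agggtaaa|tttaccct", RegexOptions.None);
    RegexOptionsSketch.CountMatches(text, "agggtaaa|tttaccct", RegexOptions.Compiled);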

Michael Hutchinson

Feb 27, 2008, 11:54:35 AM
to tirania.org blog comments.
On Feb 27, 6:39 am, Uffe Seerup <u...@wizionary.dk> wrote:
> The C# version in the Language Shootout uses "Compiled" - i.e. it is
> instructed to compile the regular expression. Now, if Mono is anything
> like MS .NET, compiled should be taken literally - the class library
> actually produces IL code which is subsequently JITed. This is a
> radically different meaning of "compiled" from the one used by the other
> languages, where the term simply means that an internal structure is
> prepared for processing rather than the raw text version of the regex.

I believe that Mono's regex parser does 'compile' the regex to an
internal representation for the interpreter. However, the version of
Mono used by the shootout does not have a true regex compiler like
you describe, and specifying "Compiled" actually has no effect; the regex
continues to use the interpreter.

> The truly compiled version of course has much better performance, but
> it takes longer to compile. A developer will only use the compiled
> option for long-lived regexes. Otherwise he will use the non-compiled
> option - which corresponds to compiled in the other languages.
>
> The shootout uses the regex once and throws it away. Hence, C# takes a
> hit for an actual compilation. The implementation should be changed to
> use non-compiled regular expressions.

Actually, it depends on the amount of processing the regex needs to
do. Recently the IL compiler was implemented (though it is still not
compiled into Mono by default) and it was found that in this
benchmark, the speed increase of the compiled regex actually outweighs
the compilation cost by a very significant factor:
http://www.advogato.org/person/lupus/diary/26.html
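
For anyone who wants to check the trade-off on their own runtime, a
rough way to measure it is to time construction plus one full scan,
which is the use-once pattern discussed above. This is only a sketch:
the class, the helper names and the generated input are hypothetical,
and a single Stopwatch reading is noisy, so treat the numbers as
indicative at best.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class RegexCostSketch
{
    // Hypothetical helper: time regex construction plus one scan of
    // the whole input, so any compilation cost is included.
    public static TimeSpan TimeOnce(string input, string pattern,
                                    RegexOptions options, out int matches)
    {
        Stopwatch watch = Stopwatch.StartNew();
        Regex regex = new Regex(pattern, options);
        // Reading Count forces all matches to be evaluated.
        matches = regex.Matches(input).Count;
        watch.Stop();
        return watch.Elapsed;
    }

    // Hypothetical input builder: each repeat contains exactly one
    // occurrence of "agggtaaa".
    public static string BuildInput(int repeats)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < repeats; i++)
            sb.Append("acgtagggtaaacgt");
        return sb.ToString();
    }
}
```

Calling TimeOnce with RegexOptions.None and then RegexOptions.Compiled
on a short input and on a long one (for example BuildInput(10) versus
BuildInput(1000000)) shows where, if anywhere, the compilation cost is
paid back on a given runtime.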