
How to compare game engines

Aleksey Linetskiy

Mar 14, 2002, 5:50:56 PM
The recent discussions about the new game engine developed by Kodrik made
me think - how can we compare different game engines? I do not want the
comparison to be done in terms of "my favorite system rules, and all
others suck".

Here is a suggestion for an experiment that could gather some
interesting data about different game engines and, therefore, may be
beneficial to the developers of both new and old engines.
-------------------------------------------------------------------------
For the testing we require:
a) A detailed description of some simple game - a couple of rooms, several
puzzles, and one NPC would be enough.
b) Several developers; each should be an expert in at least one engine.
-------------------------------------------------------------------------
STAGE 1. Programming.
Purpose: To measure the effort required to implement the proposed game
using each engine.

Each programmer develops the game in his/her engine of choice and
measures the time required to do this.

Note: I do not think the effort should be measured with great
precision. A difference of several hours or even days is definitely not
important.
-------------------------------------------------------------------------
STAGE 2. Debugging.
Purpose: To measure the effort required to debug the game.

The games are submitted to several beta-testers.

Note: I've included this stage separately for one reason. After reading
the description of Kodrik's engine, I got the feeling that, though it may
be relatively easy to program a game in his system, the debugging could
become very hard, due to the great number of key sequences and their
possible combinations.
-------------------------------------------------------------------------
STAGE 3. User testing.
Purpose: To get some statistics regarding how good the engine is for
users.

The game is submitted to several INEXPERIENCED users. Their input is
logged and analysed later for the following statistics:

- Percentage of successful commands (a command is entered and
recognized by the engine correctly).
- Percentage of diagnosed errors (an incorrect command is entered; the
engine diagnoses the problem).
- False recognitions (a correct command is entered; the engine recognizes
it incorrectly).
*** This item may require additional input from the tester. For example,
after this happens, we can ask the player to enter some special command.
- False diagnostics (the user enters something which he/she considers to
be a valid command; the engine treats it as an error).

After the game, we can also ask the users to provide some comments and to
grade their level of satisfaction.
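
To make this concrete, here is a minimal sketch of how the logged
transcripts could be scored. The log format is a pure assumption on my
part (one outcome label per player command, assigned by the tester);
no engine produces anything like this today. Python, just because it
is short:

# Minimal sketch: score one logged transcript for the four statistics above.
# Assumed (hypothetical) log format: one outcome label per player command.
from collections import Counter

CATEGORIES = ("success", "diagnosed_error", "false_recognition",
              "false_diagnostic")

def score_transcript(records):
    """Return each category as a percentage of all logged commands."""
    counts = Counter(records)
    total = sum(counts[c] for c in CATEGORIES)
    if total == 0:
        return {}
    return dict((c, 100.0 * counts[c] / total) for c in CATEGORIES)

# Example: a ten-command session logged by one beta-tester.
log = ["success"] * 7 + ["diagnosed_error", "false_recognition",
                         "false_diagnostic"]
for category, pct in sorted(score_transcript(log).items()):
    print("%s: %.0f%%" % (category, pct))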

Note: IMHO this stage should provide the most interesting results. By
comparing error statistics, we can find areas for improvement in
different engines.


Aleksey "F" Linetskiy

Kodrik

Mar 14, 2002, 6:46:00 PM
> Here is a suggestion for an experiment that could gather some
> interesting data about different game engines and, therefore, may be
> beneficial to the developers of both new and old engines.
> -------------------------------------------------------------------------
> For the testing we require:
> a) A detailed description of some simple game - a couple of rooms, several
> puzzles, and one NPC would be enough.
> b) Several developers; each should be an expert in at least one engine.

I think this would be great. A complex little game that shows a lot of the
game features expected by players, like NPC interaction.

I think the specs should be a chain of events, rather than an implementation
you have to copy. Each developer will implement it the way it is best
handled in their engine and can show off some tricks. Ideally, the specs
should be written and published before the game is implemented in any
engine.
I don't like the fact that with COD, the implementations are supposed to be
clones. It just says "this engine can do it" instead of "taste the
difference".


> -------------------------------------------------------------------------
> STAGE 1. Programming.
> Purpose: To measure the effort required to implement the proposed game
> using each engine.
>
> Each programmer develops the game in his/her engine of choice and
> measures the time required to do this.
>
> Note: I do not think the effort should be measured with great
> precision. A difference of several hours or even days is definitely not
> important.

I don't think there will be much difference in time for people who have
mastered their engine. I think a tutorial for each engine on how to write
this program would be the best way to show the differences and which
method is closest to the heart of the user.


> -------------------------------------------------------------------------
> STAGE 2. Debugging.
> Purpose: To measure the effort required to debug the game.
>
> The games are submitted to several beta-testers.
>
> Note: I've included this stage separately for one reason. After reading
> the description of Kodrik's engine, I got the feeling that, though it may
> be relatively easy to program a game in his system, the debugging could
> become very hard, due to the great number of key sequences and their
> possible combinations.

That is a good one, but I think it would not be a fair comparison at this
time. I have tools to optimize beta testing: since the author can work live
on his game and get feedback live, a lot more can be done faster. For
other engines you would have to email changes as you make them.
So even though there is more to do with my engine in beta-testing, it is
done faster and more efficiently.
Another factor is libraries: if they are present for what you want to
do, a lot of the work and debugging has already been done. So there will be
a big difference between debugging a game that uses existing libraries and
debugging a game without the needed libraries. And of course, when a game
like this one is implemented, the libraries for its elements are created
at the same time, so redoing a game that has already been done with the
engine is easy - which gives a false impression if you want to code your
own game.
I think it's easier to just make it clear to everyone that setting up the
keys is, at this point, an unreasonable amount of work with my engine.

> -------------------------------------------------------------------------
> STAGE 3. User testing.
> Purpose: To get some statistics regarding how good the engine is for
> users.
>
> The game is submitted to several INEXPERIENCED users. Their input is
> logged and analysed later for the following statistics:
>
> - Percentage of successful commands (a command is entered and
> recognized by the engine correctly).
> - Percentage of diagnosed errors (an incorrect command is entered; the
> engine diagnoses the problem).
> - False recognitions (a correct command is entered; the engine recognizes
> it incorrectly).
> *** This item may require additional input from the tester. For example,
> after this happens, we can ask the player to enter some special command.
> - False diagnostics (the user enters something which he/she considers to
> be a valid command; the engine treats it as an error).
>
> After the game, we can also ask the users to provide some comments and to
> grade their level of satisfaction.

I'm not sure how you would handle false recognitions, since the engine
thinks it is doing the right thing.

> Note: IMHO this stage should provide the most interesting results. By
> comparing error statistics, we can find areas for improvement in
> different engines.

In my case it is a built-in feature, because I am server-based, but for
local programs that would mean asking the users to upload their stats
themselves to some kind of repository. I'm not sure how many would.

OKB -- not okblacke

Mar 14, 2002, 9:26:00 PM
Kodrik <kod...@zc8.net> wrote:
>I think the specs should be a chain of events, rather than an implementation
>you have to copy. Each developer will implement it the way it is best
>handled in their engine and can show off some tricks. Ideally, the specs
>should be written and published before the game is implemented in any
>engine.
>I don't like the fact that with COD, the implementations are supposed to be
>clones. It just says "this engine can do it" instead of "taste the
>difference".

The Cloak of Darkness specification doesn't specify an implementation. It
specifies what you're supposed to implement. My impression has been that Cloak
of Darkness is intended to serve, not as a showcase for the "tricks" of each
programming language, but as a way of comparing the differences in coding
"flavor". The truth is that most of an average game is not nifty tricks but
fairly mundane code.

It would be interesting to see several "free" implementations of a less
specific specification, but it wouldn't really be a comparison of the kind
that Cloak of Darkness is; indeed, beyond a certain level of liberality in
the spec, you could just compare the source code of games released in
different languages. (This is making me think of WalkthroughComp -- what
about a similar comp where the walkthrough was not open to artistic
interpretation?)

--OKB (Bren...@aol.com) -- no relation to okblacke

"Do not follow where the path may lead;
go, instead, where there is no path, and leave a trail."
--Author Unknown

Kodrik

Mar 14, 2002, 9:57:06 PM
> (This is making me think of WalkthroughComp -- what about a similar comp
> where the walkthrough was not open to artistic interpretation?)

That could be really great. A spec comp?
And the winner would have his story ported to every engine, if there are
volunteers to do the port for each engine.
The authors could concentrate on the content without being restricted by
an engine. It would then be up to each engine to resort to whatever it
supports to produce an implementation as close to the specs as possible.

But a format to write the specs in will have to be agreed on. This format
could be very useful for authors planning their stories, even those not
taking part in the comp.
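
Just to make the idea concrete, here is a rough sketch of what a
chain-of-events spec could look like. The format, the event names and
the fields are all my own invention - no engine reads anything like
this - and it is written as a Python structure purely for concreteness:

# Hypothetical chain-of-events spec for a tiny sample game.
# Every field name here is an assumption, not an existing format.
spec = {
    "title": "Sample Game",
    "events": [
        {"id": 1, "trigger": "player enters the foyer",
         "outcome": "the NPC greets the player"},
        {"id": 2, "trigger": "player asks the NPC about the locked door",
         "requires": [1],  # event 1 must already have happened
         "outcome": "the NPC hints at where the key is hidden"},
        {"id": 3, "trigger": "player unlocks the door with the key",
         "requires": [2],
         "outcome": "the game ends in victory"},
    ],
}

# A port would conform if it reproduces every outcome once the triggers
# fire in an order that respects the "requires" dependencies.
for event in spec["events"]:
    deps = event.get("requires", [])
    print("Event %d: %s -> %s (after %s)"
          % (event["id"], event["trigger"], event["outcome"], deps))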

Plugh!

Mar 15, 2002, 2:01:53 AM
Aleksey Linetskiy <gar...@saintly.com> wrote in message news:<MPG.16faf33f1...@news.cis.dfn.de>...

> For the testing we require :
> a) Detailed description of some simple game - couple of rooms, several
> puzzles, one NPC would be enough.
> b) Several developers; each should be an expert in at least one engine.

Well, you are obviously going to get lots of people mentioning Roger
Firth's "Cloak of Darkness"
(http://www.firthworks.com/roger/cloak/index.html) here.

It lacks the NPC, but that is probably best for what it attempts to
achieve. Roger's intention is to define a very simple piece which is
then implemented in each language, so that a potential author can view
each of them and decide (rather subjectively) which one he prefers.

Of course, most people reading this know that already, so why did I
bother contributing to my ever-worsening RSI?

Adding an NPC to Cloak would add quite a degree of complexity, which
wouldn't really help its purpose.

Otoh, I do agree that it would be useful to know how a language
handles NPCs. But things like default reactions to standard commands are
more a function of the library than of the language. And splitting
that hair causes me to ask whether you really want to compare game
engines, as the title of the thread states, or development systems.

On the third hand, I'd personally like to see a feature comparison
grid. I'll leave it as an exercise for the reader to specify which
features should be listed.

Jim Aikin

Mar 16, 2002, 12:06:02 AM
Aleksey Linetskiy wrote:

> The recent discussions about the new game engine developed by Kodrik made
> me think - how can we compare different game engines?


This is an interesting topic, but your proposal contains an implicit
assumption -- namely, that it's possible to measure the differences
between game engines in an objective way (qualitatively -- I certainly
don't mean "measure" quantitatively). It seems to me, contrariwise, that
an engine that's terrific for you might suck for me, and vice-versa.
Also, an engine that's terrific for game scenario A might suck for game
scenario B.

There are simply too many factors. Every week or so (I work for Keyboard
magazine) I hear from a reader who wants to know which synthesizer is
"the best." The answer is, "Best for what?"

Also, while I agree that it's useful to get feedback from inexperienced
players, I suspect that their feedback will tell you a lot more about
what the game programmer did or didn't do by way of implementation than
about the engine.

--Jim Aikin

Peter Seebach

Mar 16, 2002, 1:08:50 PM
In article <3C92D2B2.508@kill_spammers.org>,
Jim Aikin <kill_spammers@kill_spammers.org> wrote:
>There are simply too many factors. Every week or so (I work for Keyboard
>magazine) I hear from a reader who wants to know which synthesizer is
>"the best." The answer is, "Best for what?"

Picking up chicks! What else would you use a synthesizer for?

-s
--
Copyright 2002, all wrongs reversed. Peter Seebach / se...@plethora.net
$ chmod a+x /bin/laden Please do not feed or harbor the terrorists.
C/Unix wizard, Pro-commerce radical, Spam fighter. Boycott Spamazon!
Consulting, computers, web hosting, and shell access: http://www.plethora.net/

Gary Shannon

Mar 16, 2002, 2:05:21 PM

> In article <3C92D2B2.508@kill_spammers.org>,
> Jim Aikin <kill_spammers@kill_spammers.org> wrote:
>There are simply too many factors. Every week or so (I work for Keyboard
>magazine) I hear from a reader who wants to know which synthesizer is
>"the best."

Korg N5. ;-)

--gary


Plugh!

Mar 16, 2002, 2:42:53 PM
Jim Aikin <kill_spammers@kill_spammers.org> wrote in message news:<3C92D2B2.508@kill_spammers.org>...

> Aleksey Linetskiy wrote:
>
> It seems to me, contrariwise, that
> an engine that's terrific for you might suck for me,

Stating the bleedin' obvious (but I couldn't have said it better
myself :-)

> There are simply too many factors. Every week or so (I work for Keyboard
> magazine) I hear from a reader who wants to know which synthesizer is
> "the best." The answer is, "Best for what?"

I'm not interested in keyboards, but camera, computer, and similar mags
regularly patronize us with articles about what's best for certain
types of user (beginner, keen amateur, pro).

I do think that a feature comparison grid might be useful, and then we
can make our own subjective judgements from that. Magazines somehow
manage to compare products while realizing that the reviews will be
read by end-users of varying skills.

btw, it's the first sunny Saturday of the year here in Germany & I've
been down the beergarden all day, so I hope that I don't come over too
badly in electrons - no offense intended.

Jim Aikin

Mar 16, 2002, 7:54:35 PM
Peter Seebach wrote:


>>There are simply too many factors. Every week or so (I work for Keyboard
>>magazine) I hear from a reader who wants to know which synthesizer is
>>"the best." The answer is, "Best for what?"
>
> Picking up chicks! What else would you use a synthesizer for?


I have some bad news for you. The guitar players get all the chicks.

--JA
