Anybody out there have any idea what size vocabulary the Infocom
text adventures had? I'm thinking largely of the later ones, not early
games like Zork.
I'm asking 'cos I'm working on a TA myself and right now, although I'm
not even a fraction of the way through constructing the damn thing I've got
to around 2000 words plus. There's a bit of overlap there, but it
includes all nouns, verbs, adjectives, pronouns and articles. And I'm
going to add a lot more words (mainly nouns & adjectives now; think
my verb structure is fairly set. Hopefully) before I'm through.
So I'm just wondering what is considered to be a "large" vocabulary
by text adventure standards.
- Neil K. (neil_...@sfu.ca)
Hope this helps you get an idea for your vocabulary size.
Most Infocom games ranged from around 1500 to 2000, if I'm not
mistaken. Some of the earlier games like Starcross had smaller
vocabularies; perhaps 1200 words.
I'm not aware of any game that has a vocabulary larger than 2000 words,
though one might exist. From personal experience 2000 words seems to
be a common limit. (It's unusual to have enough objects to take you
beyond that limit, but certainly possible.)
Where'd you get these numbers? Infocom claimed higher values for at
least some of those games, unless my memory has failed me. E.g.,
Starcross: claimed 1200 words.
I know, but these figures come from directly decoding the dictionaries
in the game datafiles themselves, so quite probably they are accurate.
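
For anyone curious what "directly decoding the dictionaries" could look like, here is a minimal sketch in Python. It assumes the story-file layout described in the Z-Machine Standards Document: the dictionary address sits in header bytes 0x08-0x09, and the dictionary begins with a separator list, a one-byte entry length and a 16-bit entry count. The function name is made up for illustration, and this is not necessarily how the figures in this thread were produced.

```python
import struct

def count_dictionary_words(story: bytes) -> int:
    """Return the number of entries in a Z-machine story file's dictionary."""
    # Header bytes 0x08-0x09 hold the dictionary address (big-endian).
    (dict_addr,) = struct.unpack_from(">H", story, 0x08)
    # The dictionary starts with a count of word-separator characters,
    # followed by that many separator bytes.
    num_separators = story[dict_addr]
    pos = dict_addr + 1 + num_separators
    # Skip the one-byte "length of each entry" field.
    pos += 1
    # The entry count is a 16-bit value; later versions may store it
    # negated for an unsorted dictionary, so take the absolute value.
    (num_entries,) = struct.unpack_from(">h", story, pos)
    return abs(num_entries)
```

Called on the raw bytes of a story file, e.g. count_dictionary_words(open(path, "rb").read()), this returns only the number of dictionary entries; it says nothing about how many of them are synonyms.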
There are several reasons why Infocom might be claiming more than that.
1. There may be some intelligence in the parser which allows it to at
least do some simple derivation of words (e.g. pluralization and so on) and
Infocom might count that as additional words.
2. Possibly Infocom counts some common word combination as additional
3. 1200 words sounds a lot better than 557 words. And for a long time
Infocom had no reason to assume that readers for their datafile were
Try to get more ambitious than that and you'd buckle a feeble computer
like the 64.
I remember reading many years back about Infocom text adventures in
Discover and they said that basically the games are huge sets of rules
interacting with one another.
There is no explicit statement that says: if x and y or z then a, in other
words, not a huge collection of clauses, but rather assertions which
allow things to happen if true.
There can sometimes be holes in the logic. In Sorcerer, I had a scroll which
made things travel through time. I cast it on a hawker in the amusement
park and it told me the hawker disappeared. But when I tried to enter the
booth, it said that the hawker pushed me away, but no other verb with
the hawker as the object would acknowledge the existence of the hawker.
600 words is enough to make for a very interesting and powerful parser,
at least for me. I look forward to yours.
Before you get too smug, just remember that your game doesn't have to
run in 48K! :-)
> I am curious how the term "word" is defined, however. For example,
>if I use the word "tie" as a noun, an adjective and a verb, is that
>one word or three?
Personally, I'd say "tie," "tie," and "tie" are different words as
you've described them. (BTW, I don't really know how TADS counts, so I
don't actually know how the Unnkulian vocab sizes were derived.)
> For instance, in my TA there are at least a couple of synonyms (sometimes
>five or six) for each verb. And each item has tons of adjectives and
>synonymous nouns. Why? Because it really annoys me when I'm playing
>a TA and it says "I don't know the word 'fish'" for every other command
>I type. Particularly when a word is used in the description of a place
>or a thing.
This last bit is crucial, I think. A good rule of thumb is "if the
game uses a noun in a description, the parser should recognize that
noun." Few games really adhere to this since it's a bit of a pain, and
probably impossible if your game has to run in 48K. Now that we have
lots more memory, we should see games that are better about this. I've
found that even metaphorical things like "tendrils of light" are worth
putting nouns in for, since you can at least give the player an
obnoxious message about how silly they are for examining EVERY SINGLE
THING in the room description. :-)
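
The rule of thumb above can be checked mechanically. Here is a rough sketch in Python, with a hypothetical helper name and made-up sample data. It cannot tell nouns from other words, so it simply reports every description word that is neither in the parser's vocabulary nor in a list of words you have chosen to ignore.

```python
import re

def unknown_words(description: str, vocabulary: set[str], ignore: set[str]) -> list[str]:
    """List words used in a description that the parser would not recognize."""
    # Split the description into lowercase alphabetic words.
    words = re.findall(r"[a-z]+", description.lower())
    missing = []
    for word in words:
        # Skip known words, deliberately ignored words, and repeats.
        if word in vocabulary or word in ignore or word in missing:
            continue
        missing.append(word)
    return missing

# Hypothetical room description and vocabulary, for illustration only.
description = "Tendrils of light fall on the dusty floor."
vocabulary = {"light", "floor"}
ignore = {"of", "on", "the", "fall"}
print(unknown_words(description, vocabulary, ignore))
```

Anything this prints is a candidate for a decoration object, or at least for a synonym on an existing one.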
Of course, you can always give very minimalistic room descriptions,
thereby cutting down on the number of "decorations" you need to put in,
but somehow this isn't very satisfying. This whole problem seems a bit
similar to the standard text adventure problem of limiting where the
player can go because you don't want to write descriptions for a
million rooms that aren't important. In D.A. Leary's latest game
"Horror of Rylvania," for example, all the windows in a particular
building are nailed shut. Sure it's an interesting part of the
setting, but it's mostly that way so he doesn't have to deal with what
happens when you go through all the windows! I think you're going
to end up doing some of this in any adventure game. (BTW, Horror of
Rylvania will be released Real Soon Now.)
> I have been thinking that maybe I have sort of gone overboard, though.
>As it is things like walls, ceilings, streetlights, sidewalks, curbs,
>etc, are objects which can be referred to in game play.
This is obviously somewhat philosophical, but I'd say that you've done
the right thing here. I'd almost go so far as to say that you can't
have too much setting detail, particularly if you've got a good plot to
go along with it. When you compare adventure game settings to setting
development in novels, you begin to see how lacking most adventures are
in this regard. The degenerate case of "You are in a room. Exits lead
..." is sometimes not so different from what we really see in some games.
Note also that setting development, in contrast to character
development, is one of those things that's nearly as easy to do in
interactive fiction as it is to do in normal fiction. If you're
willing to put 10-20 decorations in every location I think you can
paint a pretty convincing picture. (Of course, this totals over
1000 decorations in a reasonably-sized game!)
>My game distinguishes
>between floors and the ground, includes clothing for the
>player, etc. The intention is to be more realistic, but I realize
>I've got to be careful not to push things too far with totally
>irrelevant detail. Hard to draw the line, though...
You can end up spending your life adding more and more layers of
detail. With clothing, for example, you could paint yourself into a
corner trying to add more and more realism. For example, will the game
let you take off your shirt and wear it as a big bandana on your head?
(I assume not. :-) But you can see how this can get totally out of
hand. Obviously you want to have everything you put in the game
further the exposition somehow, even if it's only to create a picture
or general impression of the game world in the player's mind.
I would tend to go with Carl's figures regardless of what Infocom claimed
in Status Line or anywhere else. I suspected he got the numbers from
the data files but didn't want to beg the question.
>It's a stupid thing to argue about, anyway. If someone has old Status
>Lines and can check, go ahead and send email to the above parties about
>it. I can see how a game designer would want to make sure she was in
>the right ballpark with vocabulary size (i.e., double-check that she's
>not being really stupid), but I don't think it's a very important
>topic all around. The correct amount of vocabulary is simply "whatever
Huh? It's important because someone's writing a game and wants to know
(for curiosity's sake perhaps) what standard games have in the way of
vocabulary. No one's making any claims about the importance of
vocabulary size in games. What's the problem?
A Mind Forever Voyaging    609  1813
Arthur                     325  1060
Ballyhoo                   239   962
Beyond Zork                448  1569
Border Zone                351   803
Bureaucracy                255  1416
Cutthroats                 223   790
Deadline                   255   656
Enchanter                  255   723
Hollywood Hijinx           244   854
Infidel                    249   613
Journey                    434    27
Leather Goddesses          267  1031
Moonmist                   253   955
Nord and Bert              286  1230
Planetfall                 257   696
Plundered Hearts           223   816
Seastalker                 255   916
Sherlock                   314  1194
Shogun                     413  1389
Sorcerer                   255  1012
Spellbreaker                63   850
Starcross                  239   557
Stationfall                255   789
Suspect                    255   674
Suspended                  191   676
The Hitch Hiker's Guide    236  1019
The Lurking Horror         252   773
The Witness                251   715
Trinity                    593  2120
Wishbringer                254  1063
Zork I                     260   692
Zork II                    250   684
Zork III                   222   564
Zork Zero                  574  1624
I *believe* Infocom made a big deal about Sorcerer being their first
game with a >1000 word vocabulary, in the Status Line / New Zork Times
(whichever it was at the time.) Starcross certainly didn't have
enough stuff in it to need 1200 words, I think. Also, I'm pretty
sure they didn't do any derivations.
It's a stupid thing to argue about, anyway. If someone has old Status
Lines and can check, go ahead and send email to the above parties about
it. I can see how a game designer would want to make sure she was in
the right ballpark with vocabulary size (i.e., double-check that she's
not being really stupid), but I don't think it's a very important
topic all around. The correct amount of vocabulary is simply "whatever
P.S. Actually, I guess the "interactive fiction plus" series may well
have had significantly larger vocabularies...
Boy, that's weird, though. Never thought I'd write a text adventure
with 5x the vocabulary of Zork III!
I am curious how the term "word" is defined, however. For example,
if I use the word "tie" as a noun, an adjective and a verb, is that
one word or three? My vocabulary for my TA is nearly 2000 if you exclude
all overlap, but pushing 2300+ if you count word overlap.
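
The two ways of counting can be made concrete with a small sketch in Python. The lexicon below is invented for illustration, and I don't know which convention Infocom or TADS actually uses.

```python
# Each entry is (spelling, part of speech); the sample data is hypothetical.
lexicon = [
    ("tie", "noun"),
    ("tie", "adjective"),
    ("tie", "verb"),
    ("fish", "noun"),
    ("fish", "verb"),
    ("lamp", "noun"),
]

def count_vocabulary(entries):
    """Return (count excluding overlap, count including overlap)."""
    # Excluding overlap: each spelling counts once, whatever its roles.
    distinct_spellings = {word for word, _ in entries}
    # Including overlap: each (spelling, part of speech) pair counts once.
    distinct_uses = set(entries)
    return len(distinct_spellings), len(distinct_uses)

print(count_vocabulary(lexicon))  # (3, 6)
```

By the first count "tie" is one word; by the second it is three.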
As for the importance of stressing vocabulary size. I suppose I agree
that the actual size of the vocabulary is not, in itself, the most
important thing. But I think it definitely improves the playability of
For instance, in my TA there are at least a couple of synonyms (sometimes
five or six) for each verb. And each item has tons of adjectives and
synonymous nouns. Why? Because it really annoys me when I'm playing
a TA and it says "I don't know the word 'fish'" for every other command
I type. Particularly when a word is used in the description of a place
or a thing. It's just another unwanted reminder that we're dealing
with a rather stupid computer parser.
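
A sketch in Python of how verb synonyms inflate the word count while the set of actions stays small. The table and function names are hypothetical and not taken from any particular system.

```python
# Hypothetical synonym table: canonical verb -> words the player may type.
VERB_SYNONYMS = {
    "take": ["take", "get", "grab"],
    "examine": ["examine", "inspect", "x"],
    "drop": ["drop", "discard", "release"],
}

def build_lookup(synonyms):
    """Flatten the synonym table into a word -> canonical verb mapping."""
    lookup = {}
    for canonical, words in synonyms.items():
        for word in words:
            lookup[word] = canonical
    return lookup

def canonical_verb(word, lookup):
    """Return the canonical verb for a typed word, or None if unknown."""
    return lookup.get(word.lower())

lookup = build_lookup(VERB_SYNONYMS)
print(len(VERB_SYNONYMS), "verbs,", len(lookup), "vocabulary words")
print(canonical_verb("Grab", lookup))
```

Three actions already cost nine vocabulary words here, which is one way a game ends up near 2000 without being enormous.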
I have been thinking that maybe I have sort of gone overboard, though.
As it is things like walls, ceilings, streetlights, sidewalks, curbs,
etc, are objects which can be referred to in game play. My game
distinguishes between floors and the ground, includes clothing for the
player, etc. The intention is to be more realistic, but I realize
I've got to be careful not to push things too far with totally
irrelevant detail. Hard to draw the line, though...
- Neil K. (neil_...@sfu.ca)
>Actually I can believe 4000 words, though it requires some force of will. :-)
Do you mean 4000 words for each game? I can't get TADS to count my
game's vocabulary, although its statistics feature is supposed to. Every
time I try it it tells me I have 0 of everything - words, superclasses,
strings, etc. Maybe there's a message there...
>Well I think this might be another example of the problems with defining
>what a "word" is. But is it still possible to buy Amnesia? Do they
>have it out for the PC? Sounds like an interesting game, though I'm
>not sure I really *want* to explore all of Manhattan, *especially* not
>the parts underground.
I remember seeing ads for that years ago. Isn't it the game in which
you supposedly wake up in an alley or something and you have to figure
out who you are, etc? I was a bit annoyed when I heard of this game
(and Deja Vu for the Mac and also, I guess, Infidel from Infocom)
because I thought the amnesia conceit might be an appropriate one for
a text adventure. I was planning one years ago (before I'd heard of these
other games) based more or less on that theme. You were a tax collector
and you had to collect years of backtaxes from someone, but you had
to play the game to figure that out.
Of course the game I'm working on now (less half-heartedly this time)
also has a generally amnesiac theme. And so I'll be accused of doing
something completely unoriginal. The penalty of messing around for
years instead of actually writing something...
- Neil K.
Apple ][??? Is that still a marketing constraint? (If so, I'm glad!
Big memories make for lazy coders...)
What version of TADS are you using? I've never had a problem with the
stats, though Leary has run into trouble with the 64K segmentation.
With a .GAM file of around 230K he gets really weird behavior out of
TADS 1.2 caused by RAM shortage. (The code segment is now too large
to fit in one segment.)
Anyway, Mike Roberts tells me these things will be fixed in TADS 2.0,
which he says will support *virtual memory*. Pretty nifty.
> Of course the game I'm working on now (less half-heartedly this time)
>also has a generally amnesiac theme. And so I'll be accused of doing
>something completely unoriginal.
Few things are created in a vacuum. I wouldn't worry about people
thinking your game is unoriginal.
Apple ][, Atari 8-bit, C-64, etc. were all 48K (and sometimes 64K and
128K) machines that would run Infocom games. These days I don't think
memory is much of a problem for text games, nor is running on a 48K
machine a marketing consideration.
>In article <1992Mar23....@ils.nwu.edu> bar...@ils.nwu.edu (Jorn Barger) writes:
>>d...@wam.umd.edu (David M. Baggett) writes:
>>> Before you get too smug, just remember that your game doesn't have to
>>> run in 48K! :-)
>>Apple ][??? Is that still a marketing constraint? (If so, I'm glad!
>>Big memories make for lazy coders...)
>Apple ][, Atari 8-bit, C-64, etc. were all 48K (and sometimes 64K and
>128K) machines that would run Infocom games. These days I don't think
>memory is much of a problem for text games, nor is running on a 48K
>machine a marketing consideration.
I was going to respond to this one, actually, but decided not to bother.
But... what the heck. Well, I was more surprised than smug re: having
a big vocabulary in my TA. (and yes, I did see the smiley. Didn't want
to be accused of overreacting and not recognizing humour when I see
it!) But I respond because I thought the constraint wasn't so much the
memory of these systems as the space on their mass storage media.
As I recall Infocom games loaded in code segments off floppy as necessary
- fairly innovative for the day. This meant that they could run off
a 143K floppy on a 48K Apple ][. Plus, of course, they did simple
text compression on the text to cram even more onto each disk.
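
The post only says "simple text compression". As one illustration, the Z-Machine Standards Document describes packing three 5-bit characters into each 16-bit word, with the top bit marking the last word of a string. The Python sketch below is a simplification of that scheme: it handles lowercase letters and spaces only, with no shift characters or abbreviations, and the function name is made up.

```python
def pack_text(text: str) -> bytes:
    """Pack lowercase text into 16-bit words, three 5-bit codes per word."""
    codes = []
    for ch in text:
        if ch == " ":
            codes.append(0)                      # code 0 is a space
        elif "a" <= ch <= "z":
            codes.append(ord(ch) - ord("a") + 6) # codes 6..31 are a..z
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    # Pad with code 5 so the codes fill whole 16-bit words.
    while len(codes) % 3 != 0:
        codes.append(5)
    out = bytearray()
    for i in range(0, len(codes), 3):
        word = (codes[i] << 10) | (codes[i + 1] << 5) | codes[i + 2]
        if i + 3 == len(codes):
            word |= 0x8000                       # top bit marks the final word
        out += word.to_bytes(2, "big")
    return bytes(out)

print(len("open mailbox"), "characters ->", len(pack_text("open mailbox")), "bytes")
```

Twelve characters fit in eight bytes, so plain lowercase text shrinks by about a third before any further tricks.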
But it is kind of strange to run TADS under MultiFinder on a Mac, since
it demands two megs (!) of memory. It doesn't actually seem to use
that ridiculous amount of space, so I've set the memory preference
down a bit, but still...
- Neil K. (neil_...@sfu.ca)
>What version of TADS are you using? I've never had a problem with the
>stats, though Leary has run into trouble with the 64K segmentation.
>With a .GAM file of around 230K he gets really weird behavior out of
>TADS 1.2 caused by RAM shortage. (The code segment is now too large
>to fit in one segment.)
I'm running TADS 1.2.07 on an elderly Mac Plus. Unfortunately the TADS
interface was designed for a command-line system, and it's a bit
annoying to run it on a desktop system. Largely because it quits
after every compile. The -p option pauses, but it still quits once
you press a key. Anyway, tagging the stats option (-s, isn't it? I
don't have TADS in front of me) yields a list of various parameters
for my game; each one having 0 as the value. Weird. And I'm rapidly
approaching this 230K limit you mention... Great!
>Anyway, Mike Roberts tells me these things will be fixed in TADS 2.0,
>which he says will support *virtual memory*. Pretty nifty.
Sounds good. I mailed him a list of a zillion features I'd like to
see... Wonder when 2.0 will be out. (of course, it'll mean a new
version of ADV.T and since I completely rewrote ADV.T it means more
time spent updating code... Oh, well.)
>Few things are created in a vacuum. I wouldn't worry about people
>thinking your game is unoriginal.
Yeah. Guess that sounds pretty neurotic, eh?
- Neil K. (neil_...@sfu.ca)
It's not really the size of the .GAM file so much as the size of the
code segment, which is only one part of the .GAM file. In your case
you probably have loads of strings in there taking up space. TADS
seems to handle loads of strings OK, just not loads of code. (And 32K
of TADS "t-code" is quite a lot.) Incidentally, I think TADS stores
the strings compressed. If you look at the .GAM files with a hex
editor you'll find only the strings enclosed in single quotes in the
source code in there.
Warning: you may have problems when you go to compile your game on the
PC. Because of its brain-damaged memory "management" scheme it seems
to be more finicky than the Mac and Atari. It might be worth trying on
a friend's PC right now to see if you've already passed the "magic
I'm hoping 2.0 will be out and reasonably debugged before I'm done my
current game. Virtual memory seems like the _right_ way to do things;
then games can run in 64K and not have problems on any of the
machines. (It would be neat to fire up a TADS game on an ancient
IBM PC (i.e., 4 MHz!) with only 256K of RAM and no hard drive...)
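
The general idea behind running a large game in a small memory is to keep only a few fixed-size pages of the game file in RAM and fetch the rest from disk on demand. Here is a minimal sketch in Python using least-recently-used eviction. The class name, page size and eviction policy are my assumptions; nothing here describes how TADS 2.0 or the Infocom interpreters actually do it.

```python
import io
from collections import OrderedDict

class PagedFile:
    """Read-only view of a file that keeps only a few pages in memory."""

    def __init__(self, stream, page_size=512, max_pages=4):
        self.stream = stream
        self.page_size = page_size
        self.max_pages = max_pages
        self.pages = OrderedDict()   # page number -> bytes, oldest first
        self.loads = 0               # how many times we went to "disk"

    def _page(self, number):
        if number in self.pages:
            # Cache hit: mark the page as most recently used.
            self.pages.move_to_end(number)
            return self.pages[number]
        # Cache miss: read the page from the underlying stream.
        self.stream.seek(number * self.page_size)
        data = self.stream.read(self.page_size)
        self.loads += 1
        self.pages[number] = data
        if len(self.pages) > self.max_pages:
            # Evict the least recently used page.
            self.pages.popitem(last=False)
        return data

    def read_byte(self, address):
        page, offset = divmod(address, self.page_size)
        return self._page(page)[offset]

# Usage: a 4096-byte "game file" read through a two-page cache.
game = PagedFile(io.BytesIO(bytes(range(256)) * 16), page_size=512, max_pages=2)
print(game.read_byte(513), game.loads)
```

The trade-off is the usual one: fewer resident pages means less RAM but more trips to the disk.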
>Wonder when 2.0 will be out. (of course, it'll mean a new
>version of ADV.T and since I completely rewrote ADV.T it means more
>time spent updating code... Oh, well.)
That's true. Actually, I remember discussing ways to obviate modifying
ADV.T at some point; perhaps it was in this group. Anyway, allowing
the TADS programmer to say (somehow) "OK, take all that stuff in ADV.T
and change the following select things" would help a lot. Then we
wouldn't have to keep updating our ADV.T's to keep up with TADS
> [...] Incidentally, I think TADS stores
>the strings compressed. If you look at the .GAM files with a hex
>editor you'll find only the strings enclosed in single quotes in the
>source code in there.
Yeah. I'm a bit unhappy about that. Not only are nouns, adjectives,
verbs, etc., visible, but also random event strings since they're
single quoted. I don't like that much since it tells the user what
vocab words are available and this may reveal puzzles and such. I also
like the idea of the user having to discover for him or herself what
things are in the game. (You did mean "I think TADS stores the
strings *un*compressed," didn't you?)
>Warning: you may have problems when you go to compile your game on the
>PC. Because of its brain-damaged memory "management" scheme it seems
>to be more finicky than the Mac and Atari. It might be worth trying on
>a friend's PC right now to see if you've already passed the "magic
I'm still pondering whether to release the thing (whenever it's
completed...) on anything other than the Mac. I don't have regular
access to a PC and no access at all to an Atari which means I wouldn't
be able to support users as well as I could on the Mac. And I'd feel
uncomfortable tossing a program out to the world without the ability
to support it properly... Hmmm.
> [...] Anyway, allowing
>the TADS programmer to say (somehow) "OK, take all that stuff in ADV.T
>and change the following select things" would help a lot. Then we
>wouldn't have to keep updating our ADV.T's to keep up with TADS
That would be nice. It took four or five hours to update my ADV.T
since I changed just about every object class in there, defining new
attributes for objects, etc. And fixing the #($&# sitting on things
bug. I'm sure you're familiar with this thing - I think I saw it in
both Unnkulians, but I don't remember as I haven't actually spent
the time to play them. (sorry, Dave! One day...) Anyway, if you're
sitting on something like a chair and then say SIT ON FLOOR then
you're suddenly IN THE CHAIR IN THE GROUND which makes no sense
to me. (Maybe I'm missing the point. Perhaps this is just another
cheesy ACMEism. :)
- Neil K. (neil_...@sfu.ca)
No, TADS stores double-quoted strings *compressed*. The single-quoted
ones, which include vocab words and certain other uses (which you
mention) are stored uncompressed. I'm not sure why they're not
all compressed, but at least most of them are.
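
To see for yourself what a player with a hex editor would see, you can scan a file for runs of printable ASCII. This is a generic sketch in Python that assumes nothing about the .GAM format, and the function name is made up. Compressed text will not show up as readable runs, while anything stored as plain ASCII will.

```python
import re

def printable_strings(data: bytes, min_length: int = 4) -> list[str]:
    """Return runs of printable ASCII at least min_length bytes long."""
    # Bytes 0x20..0x7e are the printable ASCII characters, space included.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]

# Made-up sample bytes standing in for part of a game file.
sample = b"\x00\x01fish\x02\xffab\x00tire swing\x00"
print(printable_strings(sample))
```

Running it over your own compiled game file shows exactly which vocabulary words and event strings are exposed.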
> I'm still pondering whether to release the thing (whenever it's
>completed...) on anything other than the Mac. I don't have regular
>access to a PC and no access at all to an Atari which means I wouldn't
>be able to support users as well as I could on the Mac. And I'd feel
>uncomfortable tossing a program out to the world without the ability
>to support it properly... Hmmm.
PC's are so ubiquitous these days it seems like you should be able to
find someone who'd let you compile your game on one and test a
walkthrough. Granted, that's not as good as a full test, but you can
always give beta versions to testers you get through the net if you
want. There are plenty of people who are willing to beta-test text
adventures these days. :-)
If you're going to make it shareware it'll be a good idea to
at least get it running on the PC and Mac. Those are the big markets.
> I'm sure you're familiar with this thing - I think I saw it in
>both Unnkulians, but I don't remember as I haven't actually spent
>the time to play them. (sorry, Dave! One day...) Anyway, if you're
>sitting on something like a chair and then say SIT ON FLOOR then
>you're suddenly IN THE CHAIR IN THE GROUND which makes no sense
Actually I hadn't noticed that, and no one's ever mentioned it in
letters or email. (Tells you something about how hard it can
be to test these games thoroughly.)
Though we had noticed that the way ADV.T handles "the ground" is
completely weird in general. We just don't use "the ground" for
anything, so few people ever mess with it. I do recall that I had to
change something so it would say "In the tire swing" instead of "On the
tire swing" in UU2.
>(Maybe I'm missing the point. Perhaps this is just another
>cheesy ACMEism. :)
It certainly is. Anything that is obviously wrong that you didn't
intend is an ACMEism for sure. :-)