I thought she was pretty lucky.
I started to wonder what the probability was of being able to make a 7-
letter word (i.e. using all 7 tiles in your rack) with the
distribution of letters in Scrabble. The number of each letter in the
game is: A9 B2 C2 D4 E12 F2 G3 H2 I9 J1 K1 L4 M2 N6 O8 P2 Q1 R6 S4 T6
U4 V2 W2 X1 Y2 Z1 Blank2
Of course it all depends on what letters are already on the board, and
whether your 7-letter word can fit in somewhere, so I'm not suggesting
that it is an easy calculation, but what is it approximately?
I guess it also depends on the size of your dictionary, but we were
using the TWL dictionary.
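For scale, the counts listed above add up to 100 tiles, so the number of equally likely 7-tile draws from a full bag is C(100,7). A minimal Python sketch of that arithmetic follows; the names COUNTS and BAG and the "?" notation for a blank are my own, not part of the post.

```python
from math import comb

# Tile counts exactly as listed in the post; "?" stands for a blank
# (that symbol is an assumption made for this sketch).
COUNTS = ("A9 B2 C2 D4 E12 F2 G3 H2 I9 J1 K1 L4 M2 N6 O8 P2 Q1 R6 S4 T6 "
          "U4 V2 W2 X1 Y2 Z1 ?2")
BAG = {tok[0]: int(tok[1:]) for tok in COUNTS.split()}

total_tiles = sum(BAG.values())
# Unordered draws of 7 tiles, treating every physical tile as distinct.
opening_racks = comb(total_tiles, 7)
print(total_tiles, opening_racks)
```

Any probability for an opening rack is then a count of favourable draws divided by this number.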
Any thoughts?
Ciao,
Chappy.
(In the UK, we'd call that a bonus play, btw.)
>After I had my
> turn she proceeded to make another 7-letter word on her next turn.
That's two lucky picks!
> I started to wonder what the probability was of being able to make a 7-
> letter word (i.e. using all 7 tiles in your rack) [...]
After the first turn, you can, of course, use all 7 of your tiles
together with what's on the board to make a word of 8 or more letters.
In last month's Scrabble Club News, Philip Nelkon quotes some stats
resulting from James Cherry's getting 2 computers to play each other
10,000 times. Among these stats: 16% of games have a bonus played on
the first move.
As you say, even if you have a 7-letter word on your rack, there is
the matter of where, if anywhere, you can play it. Another factor is
that, after a player draws 1-6 tiles, the probability that his rack
has a 7-letter word might very well be higher than what it is for
one's opening rack if the player manages his rack well. For example,
no 7-letter word can be made from AEIRSTX. A player with that on the
rack might well play XI not only for the points, of course, but also
because there is a high probability that the other 5 tiles AERST plus
the two he draws will make a 7-letter word.
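The rack-management point can be sketched in Python: keep AERST and ask which two-letter draws complete a 7-letter word. The word sample below is a tiny hand-picked assumption, and the sketch ignores blanks and how many of each tile remain in the bag, so it only illustrates the idea.

```python
from itertools import combinations_with_replacement
from string import ascii_lowercase

def completing_draws(leave, words):
    """Return the two-letter draws that turn `leave` into some 7-letter word."""
    # Compare racks and words by their sorted letters (anagram keys).
    targets = {"".join(sorted(w)) for w in words if len(w) == 7}
    hits = set()
    for pair in combinations_with_replacement(ascii_lowercase, 2):
        rack = "".join(sorted(leave + "".join(pair)))
        if rack in targets:
            hits.add("".join(pair))
    return hits

# Hypothetical sample; a real run would read a full word list.
SAMPLE = ["retains", "toaster", "dearest", "coaster", "pizzazz"]
print(sorted(completing_draws("aerst", SAMPLE)))
```

With a real dictionary, weighting each completing draw by the tiles still unseen would give the probability Mark alludes to.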
What is easy is to write a program to calculate the probability of
the tiles on one's rack being able to produce a 7-letter word.
I include such a program below. I used some brute force in the
interest of writing it quickly; on vex's shell machine it still
runs in about a minute on /usr/share/dict/words.
> I guess it also depends on the size of your dictionary, but we were
> using the TWL dictionary.
I ran it on /usr/share/dict/words, as I say, which on this machine is an
edition of the well-known online word list claimed to have been legally
derived from Webster's New International Dictionary, 2nd Edition. The
complete list is 235,882 words. And the results were:
7-letter words in dictionary: 20552
2017799913/16007560800 = 12.61%
However, there are two major sources of error here. First, the
word list includes a large number of obscure words, which in
practice few people would know. Chappy may have intended these
to be included in computing the probability he asked for, but
their existence means that the realistic probability is lower.
Looking at a random 50-word sample:
beatify bedirty beveler bindweb braided bugfish cangler
chateau chitose citrate countor drooper escolar forging
futchel garnets gluteus habitus kolkhos nibsome outback
ovarium pealike phacoid plessor provand quietly rammish
reaward reblade recroon regrind ruelike scatula scogger
silency snarler spanemy squatly stopped stunter talkful
turmoil typhoon unheler upclose uromere viatica wriggle zunyite
I find that about 30% of these are ordinary words, another 30%
are unlikely formations obviously derived from ordinary
words (for example, "rammish" and "recroon"), and the remaining
40% are obscure words. As an experiment, I took the list of
7-letter words and randomly selected 30% of them, and then 60%,
to see what results the program would then produce. I figured
they'd be fairly close to 30% and 60% of the 12.61% probability
computed above, but they came out noticeably higher:
7-letter words in dictionary: 6165
987563508/16007560800 = 6.17%
7-letter words in dictionary: 12330
1508478180/16007560800 = 9.42%
The second and possibly more important source of error is that
the word list does not include ordinary inflected forms like
"numbers" and "claimed", which are perfectly good words for Scrabble.
I simulated the effect of these inflected forms by randomly taking
60% of the 6-letter words in the dictionary and adding them to the
dictionary with "s" appended. (Most of the words will either be
nouns or verbs, and a large majority of these take -s.) Obviously
this will not be an exact result, but I think it's a reasonable
approximation. The result was:
7-letter words in dictionary: 29526
2312423993/16007560800 = 14.45%
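The inflection experiment can be sketched in Python as follows. This is not the code Mark ran (he does not show that step); the function name, the fixed seed and the sample words are assumptions for illustration.

```python
import random

def add_simulated_plurals(words, fraction=0.6, seed=0):
    """Append 's' to a random `fraction` of the 6-letter words and merge
    the results into the word list (a stand-in for real inflections)."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    six = sorted({w for w in words if len(w) == 6 and w.isalpha()})
    picked = rng.sample(six, round(fraction * len(six)))
    return sorted(set(words) | {w + "s" for w in picked})

# Hypothetical input; a real run would use the whole dictionary file.
SIX = ["number", "garnet", "citron", "beater", "wiggle",
       "hunter", "talker", "regret", "outlaw", "planet"]
extended = add_simulated_plurals(SIX)
print([w for w in extended if len(w) == 7])
```

The same sampling call with fraction 0.3 or 0.6 applied to the 7-letter words would reproduce the earlier thinning experiment.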
Perhaps someone else would like to run this code on other dictionary
files.
--------
Mark Brader, "It is impossible. Solution follows..."
Toronto, m...@vex.net -- Richard Heathfield
My text and code in this article are in the public domain.
--------
#!/bin/sh
case $# in
1)  ;;
*)  echo "Usage: $0 dictionary" >&2; exit 1;;
esac
exec 4>&1       # escape from pipe
# 7-letter words in dictionary
perl -ne 'print if /^[a-z]{7}$/' "$1" |
# Brute-force computation of tile-sets with 2 possible blanks
perl -ne '
    chomp;
    @c = split //;
    print sort(@c), "\n";
    for $i (0..6) {
        for $j ($i..6) {
            # $i == $j replaces one letter with a blank, $i < $j two
            local $c[$i] = local $c[$j] = " ";
            print sort(@c), "\n";
        }
    }
    END {
        open TTY, ">&4";
        print TTY "7-letter words in dictionary: $.\n";
    }' |
sort -u |
# Ways to generate each tile-set
perl -e '
    # Binomial coefficient; yields 0 when $r exceeds $n
    sub choose {
        my ($n, $r) = @_;
        my ($p, $q) = (1, 1);
        while ($r > 0) {
            $p *= $n--;
            $q *= $r--;
        }
        return $p/$q;
    }
    %avail = (
        " " => 2, "a" => 9, "b" => 2, "c" => 2,
        "d" => 4, "e" => 12, "f" => 2, "g" => 3,
        "h" => 2, "i" => 9, "j" => 1, "k" => 1,
        "l" => 4, "m" => 2, "n" => 6, "o" => 8,
        "p" => 2, "q" => 1, "r" => 6, "s" => 4,
        "t" => 6, "u" => 4, "v" => 2, "w" => 2,
        "x" => 1, "y" => 2, "z" => 1, );
    while (<>) {
        my %got;
        chomp;
        foreach $c (split //) { ++$got{$c}; }
        $n = 1;
        for $g (keys %got) {
            $n *= choose ($avail{$g}, $got{$g});
        }
        $tot += $n;
    }
    $universe = choose (100, 7);
    printf "$tot/$universe = %.2f%%\n", 100*$tot/$universe;
'
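For readers who prefer Python, here is a sketch of the same counting idea as the script above: expand each word into its tile-sets with 0, 1 or 2 blanks, de-duplicate, and multiply binomial coefficients per tile. It is a re-expression for illustration, not Mark's code; the function names are mine and a blank is written as a space, as in the script.

```python
from collections import Counter
from itertools import combinations
from math import comb, prod

COUNTS = ("a9 b2 c2 d4 e12 f2 g3 h2 i9 j1 k1 l4 m2 n6 o8 p2 q1 r6 s4 t6 "
          "u4 v2 w2 x1 y2 z1")
BAG = {tok[0]: int(tok[1:]) for tok in COUNTS.split()}
BAG[" "] = 2                      # the two blanks

def tile_sets(word):
    """Sorted tile multisets that spell `word` using 0, 1 or 2 blanks."""
    sets = {"".join(sorted(word))}
    for k in (1, 2):
        for idx in combinations(range(len(word)), k):
            tiles = [" " if i in idx else ch for i, ch in enumerate(word)]
            sets.add("".join(sorted(tiles)))
    return sets

def ways(tile_set):
    """Number of draws from the full bag that give exactly this multiset."""
    # comb(n, k) is 0 when k > n, so impossible multisets contribute nothing.
    return prod(comb(BAG[t], c) for t, c in Counter(tile_set).items())

def probability(words):
    """Chance that a 7-tile opening rack can spell one of `words`
    (words are assumed to be 7 lowercase letters a-z)."""
    racks = set().union(*(tile_sets(w) for w in words))
    return sum(ways(r) for r in racks) / comb(100, 7)

print(len(tile_sets("retains")), ways("aeinrst"))
```

De-duplicating the tile-sets before summing is what keeps anagrams and overlapping blank substitutions from being counted twice.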
Not necessarily. If the board was this:
S
T
O
R
M
You can still make a 7-letter word with 7 tiles:
S
T
O
R
M
SCOGGER
Although strangely, my word list from OSPD (Official Scrabble
Players Dictionary) has no words longer than 8 letters. You would
think they would know better.
>
> In last month's Scrabble Club News, Philip Nelkon quotes some stats
> resulting from James Cherry's getting 2 computers to play each other
> 10,000 times. Among these stats: 16% of games have a bonus played on
> the first move.
>
> As you say, even if you have a 7-letter word on your rack, there is
> the matter of where, if anywhere, you can play it. Another factor is
> that, after a player draws 1-6 tiles, the probability that his rack
> has a 7-letter word might very well be higher than what it is for
> one's opening rack if the player manages his rack well. For example,
> no 7-letter word can be made from AEIRSTX. A player with that on the
> rack might well play XI not only for the points, of course, but also
> because there is a high probability that the other 5 tiles AERST plus
> the two he draws will make a 7-letter word.
ObPuzzle: What 7-letter word CAN'T you make even if you go first?
ENABLE word list, which includes most inflected forms, as well
as misconjugated or misspelt foreignisms:
7-letter words in dictionary: 23208
2027838359/16007560800 = 12.67%
The difference is mere noise.
Phil
--
Dear aunt, let's set so double the killer delete select all.
-- Microsoft voice recognition live demonstration
Great puzzle. Alas, I resorted to brute force and ignorance.
Method used as spoiler space.
I used this:
$ perl -e '%avail = (
"a" => 9, "b" => 2, "c" => 2,
"d" => 4, "e" => 12, "f" => 2, "g" => 3,
"h" => 2, "i" => 9, "j" => 1, "k" => 1,
"l" => 4, "m" => 2, "n" => 6, "o" => 8,
"p" => 2, "q" => 1, "r" => 6, "s" => 4,
"t" => 6, "u" => 4, "v" => 2, "w" => 2,
"x" => 1, "y" => 2, "z" => 1, );
foreach(keys(%avail)){print("\$x=tr/$_//d;\$s+=(\$x>$avail{$_})?\$x-$avail{$_}:0;\n")if($_ ne " ");}
'
to produce the innards of this:
$ perl -ne '$s=0;chomp;next if(length!=7);$w=$_;
$x=tr/w//d;$s+=($x>2)?$x-2:0;
$x=tr/r//d;$s+=($x>6)?$x-6:0;
$x=tr/a//d;$s+=($x>9)?$x-9:0;
$x=tr/x//d;$s+=($x>1)?$x-1:0;
$x=tr/d//d;$s+=($x>4)?$x-4:0;
$x=tr/j//d;$s+=($x>1)?$x-1:0;
$x=tr/y//d;$s+=($x>2)?$x-2:0;
$x=tr/u//d;$s+=($x>4)?$x-4:0;
$x=tr/k//d;$s+=($x>1)?$x-1:0;
$x=tr/h//d;$s+=($x>2)?$x-2:0;
$x=tr/g//d;$s+=($x>3)?$x-3:0;
$x=tr/f//d;$s+=($x>2)?$x-2:0;
$x=tr/t//d;$s+=($x>6)?$x-6:0;
$x=tr/i//d;$s+=($x>9)?$x-9:0;
$x=tr/e//d;$s+=($x>12)?$x-12:0;
$x=tr/n//d;$s+=($x>6)?$x-6:0;
$x=tr/v//d;$s+=($x>2)?$x-2:0;
$x=tr/m//d;$s+=($x>2)?$x-2:0;
$x=tr/s//d;$s+=($x>4)?$x-4:0;
$x=tr/l//d;$s+=($x>4)?$x-4:0;
$x=tr/c//d;$s+=($x>2)?$x-2:0;
$x=tr/p//d;$s+=($x>2)?$x-2:0;
$x=tr/q//d;$s+=($x>1)?$x-1:0;
$x=tr/b//d;$s+=($x>2)?$x-2:0;
$x=tr/z//d;$s+=($x>1)?$x-1:0;
$x=tr/o//d;$s+=($x>8)?$x-8:0;
print("$s : $w\n")if($s>2);' < $DICT
Which produced this:
3 : pizzazz
Which seems to make sense.
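The same test reads a little more directly as a Python sketch: count how many letters of the word the bag cannot supply, and call the word unmakeable if that shortfall exceeds the two blanks. The function names are my own; the tile counts are the ones used throughout the thread.

```python
from collections import Counter

COUNTS = ("a9 b2 c2 d4 e12 f2 g3 h2 i9 j1 k1 l4 m2 n6 o8 p2 q1 r6 s4 t6 "
          "u4 v2 w2 x1 y2 z1")
BAG = {tok[0]: int(tok[1:]) for tok in COUNTS.split()}
BLANKS = 2

def blanks_needed(word):
    """Letters in `word` beyond what the bag holds; each needs a blank."""
    return sum(max(0, n - BAG.get(ch, 0)) for ch, n in Counter(word).items())

def unmakeable(word):
    """True if even both blanks cannot cover the shortfall."""
    return blanks_needed(word) > BLANKS

for w in ("pizzazz", "gagging", "scogger"):
    print(blanks_needed(w), ":", w)
```

PIZZAZZ has four Zs against one Z tile, so it needs three blanks and only two exist.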
#!/usr/bin/perl
use strict;
use warnings;
no warnings 'syntax';
my $NR_OF_TILES = 7;
my $NR_OF_BLANKS = 2;
my $DEF_FILE = "/usr/share/dict/words";
#
# English distribution.
#
my %tiles = qw [a 9 b 2 c 2 d 4 e 12 f 2 g 3
h 2 i 9 j 1 k 1 l 4 m 2 n 6
o 8 p 2 q 1 r 6 s 4 t 6 u 4
v 2 w 2 x 1 y 2 z 1];
@ARGV = $DEF_FILE unless @ARGV;
while (<>) {
    chomp;
    next unless /^[a-z]{$NR_OF_TILES}$/;
    my $missing = 0;
    while (my ($char, $count) = each %tiles) {
        $missing += s/$char/$char/g - $count if s/$char/$char/g > $count;
    }
    print "$_\n" if $missing > $NR_OF_BLANKS;
}
__END__
bazzazz
bezzazz
bizzazz
pazzazz
pizzazz
Replacing the tile distribution with the distribution of the Dutch language,
and a list of Dutch words, I could not find a single 7-letter word that
could not be made, nor any shorter word that could not be made.
I did find a six-letter (English) word that could not be made: kakkak.
And my word list also contains 'zzzz', which cannot be made either.
#
# Dutch distribution
#
my %tiles = qw [a 6 b 2 c 2 d 5 e 18 f 2 g 3
h 2 i 4 j 2 k 3 l 3 m 3 n 10
o 6 p 2 q 1 r 5 s 5 t 5 u 3
v 2 w 2 x 1 y 1 z 2];
Abigail
Oh, pshaw! Have you never heard of Asterix the Gaul?
(Yes, I know that Scrabble forbids proper names -- but
there is nothing "proper" about Asterix and Obelix and
their fellow Gauls, nor the others in their adventures
like the Roman generals Nefarius Purpus and Crismus Bonus,
the Egyptian architects Edifis and Artifis, the Goths
Hemisferic and Allegoric, the legionaries Sendervictorius
and Appianglorius, ...)
--
Eric Sosman
eso...@ieee-dot-org.invalid
Yes, the Consolidated list from puzzlers.org had all of those,
but I could only find pizzazz in my dictionary. I have the
Consolidated list in an Access database where each word also
has a signature, its letter frequency. Pizzazz would be
10000000100000010000000004. I then made a query to compare
each signature against the signature of the Scrabble bag
using a criterion that selects 7-letter mismatches. Only
these were returned.
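The signature idea can be sketched in Python (the poster used an Access query, which is not shown; the function names here are hypothetical). One digit per letter works only while no letter occurs more than nine times, which holds for 7-letter words.

```python
from string import ascii_lowercase

# Letter tiles in the Scrabble bag, a through z.
BAG_COUNTS = [9, 2, 2, 4, 12, 2, 3, 2, 9, 1, 1, 4, 2,
              6, 8, 2, 1, 6, 4, 6, 4, 2, 2, 1, 2, 1]

def signature(word):
    """26-digit string: how many times each letter a..z occurs."""
    return "".join(str(word.lower().count(ch)) for ch in ascii_lowercase)

def mismatches(sig):
    """Total letters in the signature that exceed the bag's supply."""
    return sum(max(0, int(d) - have) for d, have in zip(sig, BAG_COUNTS))

sig = signature("Pizzazz")
print(sig, mismatches(sig))
```

A word is out of reach when its mismatch count exceeds the two blanks.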
I used the latest version of the Collins Tournament and Club word list
(the official word list for Scrabble tournaments in the UK) and got
the following figures. I break the figures down by b, the number of
blanks in the rack; for each b, I give the number of racks that can
make a word and the probability of drawing such a rack with b blanks:
    b        n racks    probability
    0     1346660287   0.0841265140
    1     1036579562   0.0647556224
    2       57137678   0.0035694181
total     2440377527   0.1524515545
This word list has 32340 7-letter words. However, for the purpose of
this exercise, a set of words that are anagrams of each other count in
effect as one (what matters is whether a word can be made from the
rack, not how many words). After each set of anagrams has been reduced
to a single word, there are 26075 words left, of which 25648 can be
made from 7 letter tiles and the other 427 are words such as GAGGING,
CONCOCT and KICKING which need at least one blank.
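The anagram reduction can be sketched in Python: group words by their sorted letters, then check which groups ask for more of some letter than the bag holds. This is an illustration on a tiny hypothetical sample, not the code that produced the figures above.

```python
from collections import Counter, defaultdict

COUNTS = ("a9 b2 c2 d4 e12 f2 g3 h2 i9 j1 k1 l4 m2 n6 o8 p2 q1 r6 s4 t6 "
          "u4 v2 w2 x1 y2 z1")
BAG = {tok[0]: int(tok[1:]) for tok in COUNTS.split()}

def anagram_classes(words):
    """Map each sorted-letter key to the words that share it."""
    classes = defaultdict(list)
    for w in words:
        classes["".join(sorted(w))].append(w)
    return dict(classes)

def needs_blank(letters):
    """True if some letter is wanted more often than the bag supplies."""
    return any(n > BAG[ch] for ch, n in Counter(letters).items())

# Hypothetical sample; a real run would read the full word list.
SAMPLE = ["retains", "stearin", "nastier", "gagging", "concoct", "kicking"]
classes = anagram_classes(SAMPLE)
with_blank = [key for key in classes if needs_blank(key)]
print(len(SAMPLE), "words,", len(classes), "classes,",
      len(with_blank), "need a blank")
```

Counting racks per class rather than per word is what stops RETAINS, STEARIN and NASTIER from tripling the weight of one rack.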
I was at a Scrabble club meeting a couple of years ago here in Atlanta
and Ron Tiekert and Ray Smith started out a game with 5 consecutive
bingos. I remember their combined score at the end was over 1100.
Mark