Code Explanation: Spelling correction code

Drew

Apr 11, 2007, 10:41:16 PM

I recently saw this website: http://www.norvig.com/spell-correct.html

All the code makes sense to me save one line:

def known_edits2(word):
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)

I understand (from seeing a ruby version of the code) that the goal
here is to rerun the edits1 method on each member of the set returned
by running edits1 on the initial word. However, I'm confused how the
for within a for works. Can anyone help shed some light on this for me?

Steven Bethard

Apr 11, 2007, 11:27:42 PM

Drew wrote:
> I recently saw this website: http://www.norvig.com/spell-correct.html
>
> All the code makes sense to me save one line:
>
> def known_edits2(word):
>     return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)

This is the same as:

result = set()
for e1 in edits1(word):
    for e2 in edits1(e1):
        if e2 in NWORDS:
            result.add(e2)
return result

The thing between the ``set(`` and ``)`` is called a generator
expression (sometimes informally a generator comprehension) if you'd
like to look into it further.
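To check the equivalence above by running it, here is a minimal sketch. Note that the ``edits1`` and ``NWORDS`` below are simplified stand-ins made up for illustration (deletions only, a three-word table), not the definitions from the linked article, and ``known_edits2_loops`` is a hypothetical name for the spelled-out version.

```python
# Stand-ins for illustration only; NOT the definitions from the article.
NWORDS = {"cat": 3, "cart": 1, "at": 5}  # hypothetical word -> count table


def edits1(word):
    """Simplified stand-in: every string made by deleting one character."""
    return set(word[:i] + word[i + 1:] for i in range(len(word)))


def known_edits2(word):
    # The one-liner from the original question.
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)


def known_edits2_loops(word):
    # The same logic written as ordinary nested loops.
    result = set()
    for e1 in edits1(word):
        for e2 in edits1(e1):
            if e2 in NWORDS:
                result.add(e2)
    return result


print(known_edits2("carts") == known_edits2_loops("carts"))  # True
```

With these stand-ins, both versions return the same set for any input, since the generator expression is just the nested loops written on one line.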

STeVe

Drew

Apr 12, 2007, 8:41:20 AM

Steve -

Thanks for the response. I'm somewhat familiar with generator/list
comprehension but was unsure how multiple statements were evaluated
when chained together. From your explanation, I'm assuming they are
evaluated from the "inside out" rather than left to right or right to
left.

Does this mean that the comprehension on the inside is always evaluated
first?

Thanks,
Drew

Steven Bethard

Apr 12, 2007, 10:28:04 AM

Drew wrote:
> On Apr 11, 11:27 pm, Steven Bethard <steven.beth...@gmail.com> wrote:
>> Drew wrote:
>>> def known_edits2(word):
>>>     return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)
>>
>> This is the same as:
>>
>> result = set()
>> for e1 in edits1(word):
>>     for e2 in edits1(e1):
>>         if e2 in NWORDS:
>>             result.add(e2)
>> return result
>>
>> The thing between the ``set(`` and ``)`` is called a generator
>> expression (sometimes informally a generator comprehension) if you'd
>> like to look into it further.
>
> Thanks for the response. I'm somewhat familiar with generator/list
> comprehension but was unsure how multiple statements were evaluated
> when chained together. From your explanation, I'm assuming they are
> evaluated from the "inside out" rather than left to right or right to
> left.
>
> Does this mean that the comprehension on the inside is always evaluated
> first?

Not really (at least for the most literal interpretation of ``evaluated
first``). I find it easiest to think of translating them into regular
for loops by adding the appropriate indentation.

Starting with:

(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)

Adding newlines:

(e2
for e1 in edits1(word)
for e2 in edits1(e1)
if e2 in NWORDS)

Adding indentation:

(e2
 for e1 in edits1(word)
     for e2 in edits1(e1)
         if e2 in NWORDS)

Moving the add/append to the bottom:

for e1 in edits1(word)
    for e2 in edits1(e1)
        if e2 in NWORDS
            e2

Adding the remaining boiler-plate:

result = set()
for e1 in edits1(word):
    for e2 in edits1(e1):
        if e2 in NWORDS:
            result.add(e2)

So multiple for- and if-expressions are evaluated in the same order that
they would normally be in Python, assuming the proper whitespace was added.
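If you want to see that ordering for yourself, here is a small sketch that logs when each iterable is actually evaluated. The helpers ``outer`` and ``inner`` are hypothetical stand-ins for ``edits1(word)`` and ``edits1(e1)``; they are not part of the spelling corrector.

```python
trace = []  # records when each iterable is actually evaluated


def outer():
    # Hypothetical helper standing in for edits1(word).
    trace.append("outer")
    return [1, 2]


def inner(x):
    # Hypothetical helper standing in for edits1(e1).
    trace.append("inner(%d)" % x)
    return [x * 10, x * 10 + 1]


# Leftmost "for" is the outermost loop; the "if" filters the innermost items.
gen = (y for x in outer() for y in inner(x) if y % 2 == 0)
trace.append("generator created")
result = list(gen)

print(result)  # [10, 20]
print(trace)   # ['outer', 'generator created', 'inner(1)', 'inner(2)']
```

The inner iterable is evaluated once per item of the outer one, exactly as in the nested loops, so nothing on the "inside" runs first. One detail the loop translation does not show: the iterable of the leftmost ``for`` is evaluated as soon as the generator expression is created, while everything else waits until the generator is consumed.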

HTH,

STeVe

Drew

Apr 12, 2007, 4:17:27 PM

Wow, thanks for having the patience to write that out. This makes
perfect sense now.

-Drew