Solution to optional AI class NLP problem

Caleb Madrigal

Dec 22, 2011, 12:46:08 AM

to geekbo...@googlegroups.com

File attached. This Python script downloads a random book to use as training data, finds the most common letter in English text (which, of course, comes out to be 'e'), and finds the most common letter in the cipher text (which comes out to be 'p'). It then subtracts the two letters to get the Caesar cipher offset and uses that offset to decrypt the message.
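
For anyone reading without the attachment, here is a rough sketch of the same idea. It is not the attached script: the function names are made up for illustration, TRAINING_TEXT is a short hard-coded stand-in for the downloaded book, and SECRET is a made-up message rather than the cipher text from the class problem.

```python
from collections import Counter
import string


def most_common_letter(text):
    """Return the most frequent ASCII letter in text, ignoring case."""
    counts = Counter(c for c in text.lower() if c in string.ascii_lowercase)
    return counts.most_common(1)[0][0]


def caesar_shift(text, offset):
    """Shift every letter by offset positions; other characters pass through."""
    out = []
    for c in text:
        if c in string.ascii_lowercase:
            out.append(chr((ord(c) - ord('a') + offset) % 26 + ord('a')))
        elif c in string.ascii_uppercase:
            out.append(chr((ord(c) - ord('A') + offset) % 26 + ord('A')))
        else:
            out.append(c)
    return ''.join(out)


def crack_caesar(cipher_text, training_text):
    """Guess the offset by lining up the two most common letters, then decrypt."""
    plain_top = most_common_letter(training_text)
    cipher_top = most_common_letter(cipher_text)
    offset = (ord(cipher_top) - ord(plain_top)) % 26
    return offset, caesar_shift(cipher_text, -offset)


# Assumption: a hard-coded sample stands in for the downloaded book.
TRAINING_TEXT = ("the quick brown fox jumps over the lazy dog while the "
                 "evening breeze settles between the trees")

# Assumption: a made-up message, encrypted here so the sketch is self-contained.
SECRET = "meet me where the eleven green trees were"
CIPHER = caesar_shift(SECRET, 11)

offset, decoded = crack_caesar(CIPHER, TRAINING_TEXT)
print(offset, decoded)
```

This only works when the most common letter of the plaintext really is the most common letter of the training data, which is why a short or unusual message can fool it.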

This question was simple enough for this very naive approach to work, but it would be interesting to go the next step and build an entire probabilistic language model.
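
As a small first step in that direction, one could score all 26 possible shifts under a unigram letter model and keep the most likely one. The sketch below assumes add-one smoothing and a tiny hard-coded training text; the names are hypothetical, and a real language model would look at letter sequences or words, not single letters.

```python
import math
import string
from collections import Counter


def train_unigram(text):
    """Estimate add-one-smoothed log-probabilities for the 26 letters."""
    counts = Counter(c for c in text.lower() if c in string.ascii_lowercase)
    total = sum(counts.values()) + 26
    return {c: math.log((counts[c] + 1) / total) for c in string.ascii_lowercase}


def shift(text, offset):
    """Shift lowercase letters by offset positions; other characters pass through."""
    return ''.join(
        chr((ord(c) - ord('a') + offset) % 26 + ord('a'))
        if c in string.ascii_lowercase else c
        for c in text.lower()
    )


def log_likelihood(text, model):
    """Sum the log-probabilities of every letter in text under the model."""
    return sum(model[c] for c in text.lower() if c in model)


def best_shift(cipher_text, model):
    """Return the offset whose decryption scores highest under the model."""
    return max(range(26),
               key=lambda k: log_likelihood(shift(cipher_text, -k), model))


# Assumption: a hard-coded sample stands in for a real training corpus.
TRAINING_TEXT = ("the quick brown fox jumps over the lazy dog while the "
                 "evening breeze settles between the trees")
MODEL = train_unigram(TRAINING_TEXT)

# Assumption: a made-up message, encrypted with an offset of 7.
cipher = shift("the breeze settles between the trees", 7)
k = best_shift(cipher, MODEL)
print(k, shift(cipher, -k))
```

Unlike the most-common-letter trick, this uses the whole letter distribution, so it should be less sensitive to a message where 'e' happens not to be the top letter. With such a small training text it is still only a toy.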