Chris (http://twitter.com/broady) just sent this through to me:
http://microformatique.com/optimus/
It looks like it can do a bunch of the heavy lifting of parsing the
microformats, etc. That said, it's not that hard to parse the data
ourselves, but why reinvent the wheel?
Code is available on Google Code at
http://code.google.com/p/mf-optimus/
I'm thinking that it might be easier/more efficient/smarter to use
Optimus to parse the pages into an XML format, and then just consume
that. We'll still need to do some work to normalise it for the
database, but at least all of the quirks with parsing the microformats
would be handled.
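
To make that a bit more concrete, here's a rough sketch in Python of
the "consume the XML and normalise it" step. I haven't checked what
Optimus's XML output actually looks like, so the element names
(microformats, hcard, fn, org, url), the sample data and the
normalise_hcards helper are all made up for illustration. Treat it as
a sketch of the idea rather than working integration code.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: hypothetical sample of parser output. The real Optimus XML
# structure and element names may well differ from this.
SAMPLE_XML = """\
<microformats>
  <hcard>
    <fn>Jane Doe</fn>
    <org>Example Pty Ltd</org>
    <url>http://example.com/</url>
  </hcard>
  <hcard>
    <fn>  John Smith </fn>
    <url>http://example.org/john</url>
  </hcard>
</microformats>
"""

# Columns we would want in the database (hypothetical schema).
FIELDS = ("fn", "org", "url")


def normalise_hcards(xml_text):
    """Turn each <hcard> element into a flat dict with one key per column.

    Missing or empty fields become None, and surrounding whitespace is
    stripped, so every row has the same shape.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for card in root.iter("hcard"):
        row = {}
        for field in FIELDS:
            value = card.findtext(field)  # None if the element is absent
            cleaned = value.strip() if value else ""
            row[field] = cleaned or None
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in normalise_hcards(SAMPLE_XML):
        print(row)
```

The point is that once the parser has dealt with the microformat
quirks, our side shrinks to a small mapping from XML elements to
database columns.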
What do you think?
Jason