Here is a simple example:
http://code.google.com/p/cascading/wiki/CrawlDataWordCountCascade
Note that you will need a custom Scheme that knows how to read a whole
file into a single Tuple. The example above stuffs an HTML doc into each
line of the input file, converts the HTML to XHTML, then does XML stuff.
Unfortunately I haven't been doing any XML processing, so I haven't
written such a Scheme myself. Your Scheme class would need to set a
FileInputFormat that reads the whole file as one record, or something
along those lines. Sounds like you may have already built this.
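
To make the "whole file per Tuple" idea concrete, here is a rough sketch in
plain Java. It is not Cascading or Hadoop code: WholeFileRecords,
FileRecord, and readWholeFiles are hypothetical names made up for
illustration, and it only shows the record shape (one path plus the full
contents per file, rather than one record per line). A real Scheme would
have to get the same behavior out of its FileInputFormat.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; none of these names come from Cascading or Hadoop.
class WholeFileRecords {

    // One record per input file: the file's path and its entire contents.
    static final class FileRecord {
        final Path path;
        final String contents;

        FileRecord(Path path, String contents) {
            this.path = path;
            this.contents = contents;
        }
    }

    // Reads every regular file directly under dir into a single record each,
    // instead of splitting the file into one record per line.
    static List<FileRecord> readWholeFiles(Path dir) throws IOException {
        List<Path> files;
        try (Stream<Path> paths = Files.list(dir)) {
            files = paths.filter(Files::isRegularFile)
                         .sorted()
                         .collect(Collectors.toList());
        }
        List<FileRecord> records = new ArrayList<>();
        for (Path file : files) {
            // Assumes UTF-8 input that fits comfortably in memory.
            String contents = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            records.add(new FileRecord(file, contents));
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (FileRecord record : readWholeFiles(dir)) {
            System.out.println(record.path + " -> " + record.contents.length() + " chars");
        }
    }
}
```

Each FileRecord here plays the role of the Tuple: an XML parser downstream
sees the complete document at once, which a line-oriented reader would not
give you when the XML spans multiple lines.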
Chris K Wensel
ch...@wensel.net
http://chris.wensel.net/
btw, sorry for the late reply. I sent the email out this morning, but
it got hung up somewhere. You probably won't get this one till
tomorrow <grin>
Chris K Wensel
ch...@wensel.net
http://chris.wensel.net/