Hi all,
After spending some time trying to figure it out, I still can't manage to use a custom input reader. My intention was to use the following Mahout class as the input reader: org.apache.mahout.classifier.bayes.XmlInputFormat, in order to process XML Wikipedia dumps (as proposed in the following thread:
https://groups.google.com/forum/#!msg/mrjob/4J-Kdw3AXMI/WIBlzSSLGxcJ).
But according to my tests, the input seems to be processed as raw text, without any use of org.apache.mahout.classifier.bayes.XmlInputFormat.
Maybe I am totally missing the point of the org.apache.mahout.classifier.bayes.XmlInputFormat class?
Would you have any clue that could help me? It would be very much appreciated =)
François.