Hi Steve,
I don't think it's the file IO part. My file is stored in HDFS and I
am using the standard HDFS read API. Basically, in the open method, I
open a reader on the file. In nextTuple, I read a line, then do some
post-processing, such as splitting the string, and emit the resulting
pieces.
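
For reference, the pattern described above can be sketched roughly as
follows. This is a minimal, self-contained illustration, not the actual
spout: the class name LineSpoutSketch is hypothetical, a plain
java.io.Reader stands in for the HDFS reader, a Consumer callback stands
in for the collector's emit call, and the comma delimiter is an
assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a spout; uses no Storm or HDFS classes.
public class LineSpoutSketch {
    private BufferedReader reader;
    private boolean done = false;

    // Plays the role of open(): acquire the reader once, up front.
    public void open(Reader source) {
        reader = new BufferedReader(source);
    }

    // Plays the role of nextTuple(): read one line, split it, emit it.
    // Returns false once the input is exhausted.
    public boolean nextTuple(Consumer<List<String>> emit) throws IOException {
        if (done) {
            return false;
        }
        String line = reader.readLine();
        if (line == null) {
            done = true;
            reader.close();
            System.out.println("file completely read");
            return false;
        }
        // Post-processing: split on an assumed comma delimiter.
        emit.accept(Arrays.asList(line.split(",")));
        return true;
    }

    public static void main(String[] args) throws IOException {
        LineSpoutSketch spout = new LineSpoutSketch();
        spout.open(new StringReader("a,b,c\nd,e\n"));
        List<List<String>> emitted = new ArrayList<>();
        // Passing a no-op callback instead of emitted::add mimics
        // commenting out the emit call for a timing comparison.
        while (spout.nextTuple(emitted::add)) {
            // keep reading until the input is exhausted
        }
        System.out.println(emitted);
    }
}
```

Keeping the emit step behind a callback makes it easy to time the read
and split work separately from whatever happens downstream of emit.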
I tested this by commenting out the emit call and simply printing a
message when a file has been completely read. That takes only a few
seconds, but with the emit call left in, it takes longer to finish.
-Harold