Hi all,
I am doing my diploma thesis on Hadoop MapReduce, and I have been asked to write an application in MapReduce.
I found this code on the internet and ran it:
How can I extend that code so that it stores all the text from the web pages locally on my hard disk (text only, not images), so that the text can be processed afterwards?
That is, I need MapReduce code that downloads pages from the web and stores them on the local file system, not on HDFS.
After that, I would run the quest-search program on the local copies, so that it does not depend on network speed, because my network is very slow.
I am doing this to improve performance.
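
To make the goal concrete, here is a rough sketch in plain Java of the per-page step I have in mind: fetch a page, keep only its text, and write it to a local directory. The class name PageTextCache and its three methods are placeholders I made up; they are not part of Hadoop or of the code I found. It uses only the JDK, and the tag stripping is a crude regex rather than a real HTML parser. I am assuming a map task could call something like this for each URL, in which case the file would land on the local disk of whichever node runs the task.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical helper: downloads a page and caches its text on the local disk.
class PageTextCache {

    // Crude regex-based tag stripper: drops script/style blocks, then every
    // remaining tag (including <img>), then collapses whitespace.
    static String htmlToText(String html) {
        String s = html.replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ");
        s = s.replaceAll("(?s)<[^>]*>", " ");
        return s.replaceAll("\\s+", " ").trim();
    }

    // Reads the whole page as a string. Assumes the page is UTF-8 encoded.
    static String fetch(String url) throws IOException {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(new URL(url).openStream(), "UTF-8"));
        try {
            StringBuilder sb = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
            return sb.toString();
        } finally {
            in.close();
        }
    }

    // Writes the text to <dir>/<url-encoded URL>.txt on the local file system.
    // Very long URLs could exceed the file-name length limit; not handled here.
    static File save(File dir, String url, String text) throws IOException {
        dir.mkdirs();
        File out = new File(dir, URLEncoder.encode(url, "UTF-8") + ".txt");
        Writer w = new OutputStreamWriter(new FileOutputStream(out), "UTF-8");
        try {
            w.write(text);
        } finally {
            w.close();
        }
        return out;
    }
}
```

For one page the calls would be chained as save(dir, url, htmlToText(fetch(url))), where dir is whatever local directory the later search step reads from. What I do not know is how to wire this into the map function properly.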
I am running Hadoop version 0.20.2.
I am new to Hadoop and kind of lost, so any help would be greatly appreciated.
Sorry for my bad English.
Thanks in advance for any assistance!