I am trying to reproduce very simple examples from the basic tutorials.
With the current install, I can run "my first map reduce program" from
https://github.com/RevolutionAnalytics/rmr2/blob/master/docs/tutorial.md successfully.
When I try to run "my second map reduce program" from the same page, the reduce step hangs for a very long time.
The tail of the syslog file looks like this:
2013-12-19 14:54:52,597 WARN org.apache.hadoop.mapred.ReduceTask: java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:378)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:473)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:203)
at sun.net.www.http.HttpClient.New(HttpClient.java:290)
at sun.net.www.http.HttpClient.New(HttpClient.java:306)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:995)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:931)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:849)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1636)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.setupSecureConnection(ReduceTask.java:1593)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1493)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1401)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1333)
2013-12-19 14:54:52,597 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201312191055_0006_r_000000_0: Failed fetch #5 from attempt_201312191055_0006_m_000001_0
2013-12-19 14:54:52,597 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201312191055_0006_r_000000_0 adding host hit-nxdomain.opendns.com to penalty box, next contact in 37 seconds
2013-12-19 14:54:52,597 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201312191055_0006_r_000000_0: Got 1 map-outputs from previous failures
2013-12-19 14:55:22,599 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201312191055_0006_r_000000_0 Need another 2 map output(s) where 0 is already in progress
2013-12-19 14:55:22,599 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201312191055_0006_r_000000_0 Scheduled 0 outputs (1 slow hosts and0 dup hosts)
2013-12-19 14:55:22,599 INFO org.apache.hadoop.mapred.ReduceTask: Penalized(slow) Hosts:
2013-12-19 14:55:22,599 INFO org.apache.hadoop.mapred.ReduceTask: hit-nxdomain.opendns.com Will be considered after: 7 seconds.
2013-12-19 14:55:32,599 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201312191055_0006_r_000000_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
2013-12-19 14:56:22,604 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201312191055_0006_r_000000_0 Need another 2 map output(s) where 1 is already in progress
2013-12-19 14:56:22,604 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201312191055_0006_r_000000_0 Scheduled 0 outputs (0 slow hosts and1 dup hosts)
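To make the pattern in a longer syslog easier to see, here is a minimal Python sketch that pulls out the failed fetches and the penalized hosts. The function name `summarize` and both regular expressions are my own and only assume the ReduceTask message wording shown in the excerpt above; they are not part of Hadoop or rmr2.

```python
import re

# Patterns assume the ReduceTask log wording shown in the excerpt above.
FAILED_FETCH = re.compile(r"Failed fetch #(\d+) from (attempt_\S+)")
# \s+ also matches a line break, in case the host name wrapped onto the next line.
PENALTY_BOX = re.compile(r"adding host\s+(\S+)\s+to penalty box")


def summarize(log_text):
    """Return (highest failed-fetch number per map attempt, penalized hosts)."""
    failed = {}
    for match in FAILED_FETCH.finditer(log_text):
        number, attempt = int(match.group(1)), match.group(2)
        failed[attempt] = max(number, failed.get(attempt, 0))
    hosts = sorted(set(PENALTY_BOX.findall(log_text)))
    return failed, hosts


# Small sample copied from the log excerpt above.
sample = (
    "2013-12-19 14:54:52,597 INFO org.apache.hadoop.mapred.ReduceTask: "
    "Task attempt_201312191055_0006_r_000000_0: Failed fetch #5 from "
    "attempt_201312191055_0006_m_000001_0\n"
    "2013-12-19 14:54:52,597 WARN org.apache.hadoop.mapred.ReduceTask: "
    "attempt_201312191055_0006_r_000000_0 adding host\n"
    "hit-nxdomain.opendns.com to penalty box, next contact in 37 seconds\n"
)
failed, hosts = summarize(sample)
print(failed)
print(hosts)
```

To run it on the real file, read the syslog into a string and pass it to `summarize` in place of `sample`.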
I would expect to find errors in stderr, but both stderr and stdout are empty (zero bytes):
-rw-r--r-- 1 user user 152 Dec 19 14:56 log.index
-rw-rw-r-- 1 user user 0 Dec 19 14:37 stderr
-rw-rw-r-- 1 user user 0 Dec 19 14:37 stdout
-rw-rw-r-- 1 user user 22488 Dec 19 14:55 syslog
I imagine this "second map reduce" program should finish in a few minutes, but it has already been running for more than 23 minutes.
I suspect that no matter how long I let it run, it will keep repeating the same messages, namely that it got one map output but still needs another 2.
Any suggestions are welcome.
Thanks.