python version: 2.7.2
hadoop version: cdh 3u2
========================================
- python install: ok
- dumbo install: ok (test by import dumbo)
- run localhost mode: ok
( dumbo start ipcount.py -input access.log -output ipcounts)
- run localhost mode cat command: error, log below
dumbo cat ipcounts | sort -k2,2nr | head -n 5
Traceback (most recent call last):
File "/usr/local/bin/dumbo", line 8, in <module>
load_entry_point('dumbo==0.21.32', 'console_scripts', 'dumbo')()
File "build/bdist.linux-x86_64/egg/dumbo/__init__.py", line 32, in execute_and_exit
File "build/bdist.linux-x86_64/egg/dumbo/cmd.py", line 42, in dumbo
functions respectively.
File "build/bdist.linux-x86_64/egg/dumbo/cmd.py", line 101, in cat
File "build/bdist.linux-x86_64/egg/dumbo/backends/unix.py", line 114, in cat
TypeError: unsupported operand type(s) for +: 'Options' and 'list'
- run job on cluster: error, command is
sudo -u hdfs dumbo start /home/hdfs/ipcount.py -hadoop /opt/hadoop/ -input /user/hdfs/access_log -output /user/hdfs/result
mapreduce job log is:
java.io.IOException: log:null
R/W/S=341/0/0 in:NA [rec/s] out:NA [rec/s]
minRecWrittenToEnableSkip_=9223372036854775807 LOGNAME=null
HOST=null
USER=mapred
HADOOP_USER=null
last Hadoop input: |null|
last tool output: |null|
Date: Sat May 12 13:27:48 CST 2012
java.io.IOException: Broken pipe
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:282)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
at java.io.DataOutputStream.write(DataOutputStream.java:90)
at org.apache.hadoop.io.WritableUtils.writeString(WritableUtils.java:100)
at org.apache.hadoop.typedbytes.TypedBytesOutput.writeString(TypedBytesOutput.java:223)
at org.apache.hadoop.typedbytes.TypedBytesWritableOutput.writeText(TypedBytesWritableOutput.java:182)
at org.apache.hadoop.typedbytes.TypedBytesWritableOutput.write(TypedByte
Can somebody help me? Thanks.
--
You received this message because you are subscribed to the Google Groups "dumbo-user" group.
To view this discussion on the web visit https://groups.google.com/d/msg/dumbo-user/-/sjjMmtQ7bkcJ.
To post to this group, send email to dumbo...@googlegroups.com.
To unsubscribe from this group, send email to dumbo-user+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/dumbo-user?hl=en.
Answers are inline.
/usr/local/Python2.7/bin/dumbo start ipcount.py -hadoop /usr/lib/hadoop -input /user/hdfs/access_log -output result -python '/usr/local/bin/python'
So the first issue has been fixed now: https://github.com/klbostee/dumbo/issues/54
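For anyone who hits the same traceback, here is a minimal sketch of how this class of TypeError comes about. The `Options` class below is a hypothetical stand-in written for illustration only (it is not dumbo's actual class, and `to_list` is an assumed helper name); the point is just that an object without `__add__` cannot be concatenated with a list until it is converted.

```python
class Options(object):
    """Hypothetical stand-in for an options container; not dumbo's real class."""

    def __init__(self, pairs=None):
        self.pairs = list(pairs or [])

    def to_list(self):
        # Assumed helper: expose the stored pairs as a plain list.
        return list(self.pairs)


opts = Options([("input", "ipcounts")])
extra = [("output", "/dev/stdout")]

try:
    opts + extra  # Options defines no __add__, so Python raises TypeError
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Converting to a plain list first makes the concatenation well defined.
merged = opts.to_list() + extra
```

Whether the real fix took this shape is something only the linked issue can tell you; the sketch only shows why the operands in the error message do not combine.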
For the second one, try clicking on the failed tasks number in the Hadoop web interface and then following the link in the logs column. This should lead you to the stdout and stderr logs for the tasks, which are usually more informative than the Java error.
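A "Broken pipe" on the Java side typically only says that the streaming child process went away, so it can also help to rule out the script itself before digging through the task logs. Below is a rough sketch that assumes ipcount.py follows the usual mapper/reducer shape (the poster's script is not shown, so the two functions and the sample log lines are assumptions). It runs both functions in-process without Hadoop or dumbo, and writes the interpreter version and path to stderr, which is the stream that would land in the task's stderr log when the same line runs on a cluster node.

```python
import sys
from itertools import groupby
from operator import itemgetter


def mapper(key, value):
    # Assumed log format: the client IP is the first space-separated field.
    yield value.split(" ")[0], 1


def reducer(key, values):
    yield key, sum(values)


def run_locally(lines):
    """Simulate map -> sort -> reduce in-process, without Hadoop or dumbo."""
    mapped = []
    for offset, line in enumerate(lines):
        mapped.extend(mapper(offset, line))
    mapped.sort(key=itemgetter(0))
    results = []
    for key, group in groupby(mapped, key=itemgetter(0)):
        results.extend(reducer(key, (count for _, count in group)))
    return results


# Made-up sample lines, only for the dry run.
sample = [
    '10.0.0.1 - - [12/May/2012:13:27:48 +0800] "GET / HTTP/1.1" 200 512',
    '10.0.0.2 - - [12/May/2012:13:27:49 +0800] "GET /a HTTP/1.1" 200 128',
    '10.0.0.1 - - [12/May/2012:13:27:50 +0800] "GET /b HTTP/1.1" 404 0',
]

# Diagnostic on stderr: shows which interpreter actually ran the script.
sys.stderr.write("python %s at %s\n" % (sys.version.split()[0], sys.executable))

counts = run_locally(sample)
```

If the dry run works locally but the cluster job still dies, the stderr line is a quick way to check whether the task nodes pick up the interpreter you expect.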