Import module heapq error


Dan

Jan 29, 2013, 8:55:15 PM
to mr...@googlegroups.com
I'm getting an error on EMR. Other than comments, the beginning of the Python file looks like:

import math
from heapq import heappush, heappop, heappushpop
from mrjob.job import MRJob
from mrjob.protocol import RawProtocol, JSONValueProtocol

The stderr log file is:

Traceback (most recent call last):
  File "mr.py", line 8, in <module>
    from heapq import heappush, heappop, heappushpop
ImportError: cannot import name heappushpop
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
	at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:372)
	at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:582)
	at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:137)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:477)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:415)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
log4j:WARN No appenders could be found for logger (org.apache.hadoop.hdfs.DFSClient).
log4j:WARN Please initialize the log4j system properly.

Later I ssh'd into an Amazon EMR instance and saw Python 2.6.6, with heapq and all of these functions available. Any ideas?

Brandon Haynes

Jan 30, 2013, 8:52:25 AM
to mr...@googlegroups.com
Hi Dan --

Have you tried explicitly specifying the AMI version for your job (--ami-version 2.3.1)?  While the 2.0.0 AMI should be sufficient for using heapq.heappushpop, you should probably give this a try to ensure that there isn't some strange versioning issue going on.
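A quick way to check for a versioning issue from inside the job itself is to have the script report which interpreter it is running under. The snippet below is only a sketch: the helper name describe_interpreter is made up for illustration, and it assumes that whatever the task writes to stderr ends up in the task's stderr log, as the traceback above did.

```python
import sys
import heapq


def describe_interpreter():
    """Summarize the running interpreter and whether heapq has heappushpop.

    Hypothetical helper for illustration; not part of mrjob.
    """
    version = "%d.%d.%d" % tuple(sys.version_info[:3])
    has_pushpop = hasattr(heapq, "heappushpop")
    return "python=%s executable=%s heappushpop=%s" % (
        version, sys.executable, has_pushpop)


# Write to stderr so the message stays out of the job's output.
sys.stderr.write(describe_interpreter() + "\n")
```

If the line in the log shows an interpreter other than the one seen over ssh, that would point to the tasks running a different Python than expected.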

Brandon

Dan

Jan 30, 2013, 3:27:18 PM
to mr...@googlegroups.com
Thanks Brandon, that solved the problem! I used "--ami-version latest".
Oddly, I forgot to bring back the "from heapq import heappushpop" part, but the function call still worked.
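For anyone who can't change the AMI version, a guarded import is one possible workaround. heappushpop was added to heapq in Python 2.6, so the ImportError above would be consistent with the tasks running on an older interpreter, though nothing in this thread confirms that. The sketch below is an assumption-laden example: fallback_heappushpop is a hypothetical name, and it is simply a push followed by a pop, which returns the same value as the standard function but is less efficient.

```python
from heapq import heappush, heappop


def fallback_heappushpop(heap, item):
    """Push item onto heap, then pop and return the smallest element.

    Hypothetical stand-in for heapq.heappushpop (same result, slower).
    """
    heappush(heap, item)
    return heappop(heap)


try:
    # Available in Python 2.6 and later.
    from heapq import heappushpop
except ImportError:
    # Older interpreter: use the stand-in defined above.
    heappushpop = fallback_heappushpop
```

With this in place, the rest of the script can call heappushpop(heap, item) the same way on either interpreter.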

Dan