[Errno 104] Connection reset by peer


Ножкин Андрей

Dec 16, 2013, 10:31:53 AM
to mr...@googlegroups.com
Oh guys, all these map-reduce tasks make me sad... It's not hard to code the algorithm - it's hard to set up a tool to execute it! Really sad =(

In short: a "socket.error: [Errno 104] Connection reset by peer" exception. The script does have access to S3, because it creates buckets and uploads some small files (I've checked manually via the AWS console). But the largest file - INPUT - is not uploaded. Hey, it's just 7GB of test data!

I've tried 4 times and always got errors.

mrjob==0.4.2

CONFIG
# cat /etc/mrjob.conf 
runners:
  inline:
    base_tmp_dir: /home/tmp
  emr:
    base_tmp_dir: /home/tmp

    aws_access_key_id: [VALID KEY HERE]
    aws_secret_access_key: [VALID SECRET HERE]
    aws_region: us-east-1
    ec2_instance_type: m1.medium
    num_ec2_instances: 7

# python /home/bigdata/mr_job_1.py -r emr  /home/filesystem/INPUT > /home/filesystem/OUTPUT
using configs in /etc/mrjob.conf
creating new scratch bucket mrjob-f02b7cd37b2bfffd
using s3://mrjob-f02b7cd37b2bfffd/tmp/ as our scratch dir on S3
creating tmp directory /home/tmp/mr_job_1.root.20131216.152251.298419
writing master bootstrap script to /home/tmp/mr_job_1.root.20131216.152251.298419/b.py
creating S3 bucket 'mrjob-f02b7cd37b2bfffd' to use as scratch space
Copying non-input files into s3://mrjob-f02b7cd37b2bfffd/tmp/mr_job_1.root.20131216.152251.298419/files/
Traceback (most recent call last):
  File "/home/bigdata/workers/process_data/swap_keys_and_sites.py", line 178, in <module>
    MRSwapData().run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/job.py", line 494, in run
    mr_job.execute()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/job.py", line 512, in execute
    super(MRJob, self).execute()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/launch.py", line 147, in execute
    self.run_job()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/launch.py", line 208, in run_job
    runner.run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/runner.py", line 458, in run
    self._run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/emr.py", line 806, in _run
    self._prepare_for_launch()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/emr.py", line 817, in _prepare_for_launch
    self._upload_local_files_to_s3()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/emr.py", line 905, in _upload_local_files_to_s3
    s3_key.set_contents_from_filename(path)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 1290, in set_contents_from_filename
    encrypt_key=encrypt_key)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 1221, in set_contents_from_file
    chunked_transfer=chunked_transfer, size=size)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 713, in send_file
    chunked_transfer=chunked_transfer, size=size)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 889, in _send_file_internal
    query_args=query_args
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/connection.py", line 547, in make_request
    retry_handler=retry_handler
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 947, in make_request
    retry_handler=retry_handler)
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 908, in _mexe
    raise e
socket.error: [Errno 104] Connection reset by peer

Steve Johnson

Dec 16, 2013, 10:56:50 AM
to mr...@googlegroups.com
You should upload the file to S3 before running mrjob on it. You don't want to have to wait for 7 GB of data to upload every time you run your script anyway.
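For example (a sketch; `mybucket` is a placeholder bucket name, and this assumes the AWS CLI is installed and configured): upload the input once, then pass the s3:// path to the job so mrjob skips the local upload step entirely.

```shell
# one-time upload of the big input file
aws s3 cp /home/filesystem/INPUT s3://mybucket/INPUT

# subsequent runs read straight from S3
python /home/bigdata/mr_job_1.py -r emr s3://mybucket/INPUT > /home/filesystem/OUTPUT
```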
--
You received this message because you are subscribed to the Google Groups "mrjob" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mrjob+un...@googlegroups.com.
 

Ножкин Андрей

Dec 17, 2013, 10:18:32 AM
to mr...@googlegroups.com
Hey Steve - thanks for the reply. It helped: the job runs now, but I've hit another issue, probably more complicated than the previous one.

I'll try to explain briefly, with simplified code:
def mapper(self, _, line):
  j = json.loads(line)
  j['somekey'] = 'somevalue'
  yield j['anotherkey'], j

def combiner(self, key, values):
  for v in values:
    somevar = v['somekey']

...
So, the mapper updates a dict and yields it, and the combiner reads the key that the mapper was supposed to have added. Everything works well on the test data when running in local mode. But on Amazon EMR, the 'v' variable in the combiner doesn't have that key! It has all the others, but not the one I set in the code. I've re-checked and re-uploaded all the data to the S3 bucket, so everything looks really strange. I also tried running the map-reduce task on a smaller subset, around 300MB, and it worked as expected.

Thanks guys for your help.

Ножкин Андрей

Dec 17, 2013, 10:20:13 AM
to mr...@googlegroups.com
Forgot to mention - I use PickleProtocol as the internal protocol, for various reasons. Maybe that helps.
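For reference, setting the internal protocol in mrjob is a class-attribute config like this (a sketch; `MRSwapData` is the class name from the traceback above):

```python
from mrjob.job import MRJob
from mrjob.protocol import PickleProtocol

class MRSwapData(MRJob):
    # serialize values between mapper/combiner/reducer steps with
    # pickle instead of the default JSON protocol
    INTERNAL_PROTOCOL = PickleProtocol
```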

Jimmy Retzlaff

Dec 17, 2013, 12:22:35 PM
to mr...@googlegroups.com
Keep in mind that Hadoop may call your combiner 0, 1, or more times on a given stream of data. This means your combiner output data must be compatible with its input data (and your reducer must be OK with the combiner never being called). Does your combiner yield (key, value) pairs where the values always contain 'somekey'? Typically you only want to add a combiner as an optimization (early versions of mrjob didn't even support combiners), so one approach is to get everything working without the combiner first.
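To make that concrete, here is a pure-Python sketch (no Hadoop required; the field names mirror the simplified code above) of a combiner whose output has the same shape as its input, so running it a second time on its own output is harmless:

```python
import json

def mapper(line):
    # tag each JSON record and key it, as in the thread's simplified code
    j = json.loads(line)
    j['somekey'] = 'somevalue'
    yield j['anotherkey'], j

def combiner(key, values):
    # yield values shaped exactly like the ones consumed, so Hadoop can
    # run this combiner zero, one, or many times without losing fields
    merged = {}
    for v in values:
        merged.update(v)
    yield key, merged

# simulate Hadoop running the combiner twice on the mapper's output
key, value = next(mapper('{"anotherkey": "k1", "other": 1}'))
once = list(combiner(key, [value]))
twice = list(combiner(once[0][0], [v for _, v in once]))
print(twice[0][1]['somekey'])  # prints "somevalue": the tag survives repeated combining
```

If the combiner instead yielded a different shape (say, only a count), the second pass would crash or silently drop 'somekey', which is exactly the symptom seen on EMR.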

Jimmy

Ножкин Андрей

Dec 17, 2013, 10:31:26 PM
to mr...@googlegroups.com
Didn't know that. Thank you. It seems everything is OK now.