s3 module with checksums for multipart uploads

Ben Hood

unread,
Oct 24, 2013, 5:39:53 AM10/24/13
to ansible...@googlegroups.com
Hi,

I'm having an issue with the s3 module in ansible 1.3.3:

"Files uploaded with multipart of s3 are not supported with checksum,
unable to compute checksum."

The task definition looks like this:

action: s3 bucket=my_bucket
object=/foo/bar.txt
aws_access_key=xxxx
aws_secret_key="yyyy"
dest=/opt/bar.txt
mode=get

So the initial GET works, but any subsequent get fails, presumably due
to the checksum issue. Am I maybe doing something wrong?

Cheers,

Ben

$ ansible --version
ansible 1.3.3 (release1.3.3 291649c15d) last updated 2013/10/23
10:06:59 (GMT +100)

James Cammarata

unread,
Oct 24, 2013, 1:10:08 PM10/24/13
to ansible...@googlegroups.com
Can you share the error you're receiving back?



--
You received this message because you are subscribed to the Google Groups "Ansible Project" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ansible-proje...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.



--

James Cammarata <jcamm...@ansibleworks.com>
Sr. Software Engineer, AnsibleWorks, Inc.
http://www.ansibleworks.com/

Ben Hood

unread,
Oct 24, 2013, 11:17:06 PM10/24/13
to ansible...@googlegroups.com
Hi James,

Please find the -vvv output from ansible below.

Cheers,

Ben

TASK: [Get all of the packages from S3 onto the box] **************************

<cdr1.aws.acme.com> ESTABLISH CONNECTION FOR USER: root

<cdr1.aws.acme.com> EXEC ['ssh', '-tt', '-q', '-o',
'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o',
'ControlPath=/Users/0x6e6562/.ansible/cp/ansible-ssh-%h-%p-%r', '-o',
'Port=22', '-o', 'KbdInteractiveAuthentication=no', '-o',
'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey',
'-o', 'PasswordAuthentication=no', '-o', 'User=root', '-o',
'ConnectTimeout=10', 'cdr1.aws.acme.com', "/bin/sh -c 'mkdir -p
$HOME/.ansible/tmp/ansible-1382670810.23-62877702454141 && echo
$HOME/.ansible/tmp/ansible-1382670810.23-62877702454141'"]

<cdr1.aws.acme.com> REMOTE_MODULE s3 bucket=cdr-deployment
object=/3rd-party/jdk-7u45-linux-x64.gz aws_access_key=xxx
aws_secret_key=xxx dest=/opt/downloads/jdk-7u45-linux-x64.gz mode=get

<cdr1.aws.acme.com> PUT
/var/folders/z8/n1d02m0j26z6wv2p_793ls4m0000gn/T/tmpUDmKUP TO
/root/.ansible/tmp/ansible-1382670810.23-62877702454141/s3

<cdr1.aws.acme.com> EXEC ['ssh', '-tt', '-q', '-o',
'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o',
'ControlPath=/Users/0x6e6562/.ansible/cp/ansible-ssh-%h-%p-%r', '-o',
'Port=22', '-o', 'KbdInteractiveAuthentication=no', '-o',
'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey',
'-o', 'PasswordAuthentication=no', '-o', 'User=root', '-o',
'ConnectTimeout=10', 'cdr1.aws.acme.com', "/bin/sh -c
'/usr/bin/python2
/root/.ansible/tmp/ansible-1382670810.23-62877702454141/s3; rm -rf
/root/.ansible/tmp/ansible-1382670810.23-62877702454141/ >/dev/null
2>&1'"]

failed: [cdr1.aws.acme.com] => (item=jdk-7u45-linux-x64.gz) =>
{"failed": true, "item": "jdk-7u45-linux-x64.gz"}

msg: Files uploaded with multipart of s3 are not supported with
checksum, unable to compute checksum.

Chris Rimondi

unread,
Feb 24, 2014, 6:04:02 PM2/24/14
to ansible...@googlegroups.com
Has there been any resolution on this issue? I am running into the same problem, and uploading the files to S3 as a single part rather than multipart is not really an option.

Thanks,

Chris

James Cammarata

unread,
Feb 25, 2014, 12:44:59 AM2/25/14
to ansible...@googlegroups.com
I don't believe so; however, you might try the workaround suggested in this github issue:

James Cammarata

unread,
Feb 25, 2014, 1:23:53 PM2/25/14
to ansible...@googlegroups.com
So I've dug into this today, and the short answer is: it's possible, but difficult, to calculate the MD5 sum of multipart uploads. The main issue is that the size (in MB) of the uploaded parts needs to be known in advance; however, Amazon throws that information away once the multipart upload is complete. We could try to guess, but that would take multiple passes and could be very slow (imagine calculating the MD5 sum for 100 parts of a 10GB file 5-10 times - that's a lot of disk activity and time for a single file). Also, wrong guesses would mean people could incur additional S3 fees unnecessarily, which is obviously something we want to avoid at all costs.
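For anyone curious why the guessing is so expensive: by community observation (not any documented AWS guarantee), the ETag of a multipart upload is the MD5 of the concatenated binary MD5 digests of the parts, suffixed with a dash and the part count. A minimal sketch of computing it for one candidate part size - note that every candidate guess costs a full read of the file:

```python
import hashlib

def multipart_etag(path, part_size):
    """Compute the S3-style multipart ETag for a local file, assuming it
    was uploaded in parts of `part_size` bytes. The result is the MD5 of
    the concatenated binary MD5 digests of the parts, plus "-<count>".
    """
    part_digests = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(part_size)
            if not chunk:
                break
            part_digests.append(hashlib.md5(chunk).digest())
    combined = hashlib.md5(b"".join(part_digests))
    return "%s-%d" % (combined.hexdigest(), len(part_digests))
```

To verify a download you would have to compare this against the remote ETag for each plausible part size (5 MB, 8 MB, 15 MB, ...), re-reading the entire file once per guess - the repeated-pass cost described above.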

So, for the foreseeable future, the s3 module will have to remain dumb about multipart uploads. The primary workaround is as suggested above, to re-upload the file without using the multipart upload feature.

Trenton Strong

unread,
Aug 3, 2014, 1:49:00 PM8/3/14
to ansible...@googlegroups.com, jcamm...@ansible.com
As Chris mentioned, uploading the files without multipart was not an option.

Having recently run into the same issue with the same constraint, I wanted to suggest an alternate workaround: simply rename or delete the file currently on disk, which prevents the s3 module from attempting to compute the checksum. Note this only works for GET operations.
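In task form, that workaround might look something like the following sketch. It reuses the bucket, object, and paths from the failing task earlier in this thread purely as examples, and omits the aws_access_key/aws_secret_key parameters shown there:

```yaml
# Sketch only: remove the stale local copy first, so the s3 module has
# no existing file to checksum, then fetch the object fresh.
- name: Remove local copy so s3 get skips the checksum comparison
  action: file path=/opt/downloads/jdk-7u45-linux-x64.gz state=absent

- name: Download the multipart-uploaded object from S3
  action: s3 bucket=cdr-deployment object=/3rd-party/jdk-7u45-linux-x64.gz dest=/opt/downloads/jdk-7u45-linux-x64.gz mode=get
```

The trade-off is idempotence: with the local file always deleted first, the GET runs on every play instead of only when the object has changed.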