Advice on how to download multiple files as a single zipped file.


Yiheng

Nov 2, 2016, 2:34:51 AM
to Shrine
Hi,

I'm seeking advice on how to download multiple files (tens of GB or even more) from S3, zip them all, and then pass them to the user as a single zipped file. (I didn't notice any Shrine plugin that can do this.)

Currently, the one way I can think of is to download them all to the server's local disk, zip them, and then pass the archive to the user. Since neither my web server nor my worker server (both running RoR on EC2) has much storage, I would have to spin up a separate, dedicated server just to manage this.

I'd appreciate it if anyone could advise a better way.


Best,
Yiheng

Janko Marohnić

Nov 2, 2016, 5:04:05 AM
to Yiheng, Shrine
Here is how you could zip S3 files with Shrine using rubyzip:

require "zip"
require "tempfile"

# The zip itself is written to a temporary file on disk
zip_file = Tempfile.new(["files", ".zip"], binmode: true)

zip_stream = Zip::OutputStream.write_buffer(zip_file) do |zip|
  uploaded_files.each do |uploaded_file|
    zip.put_next_entry(uploaded_file.original_filename)
    # copy the S3 file into the current zip entry as it is downloaded
    uploaded_file.open { |io| IO.copy_stream(io, zip) }
  end
end

zip_file.fsync # flush any buffered data to disk
zip_file.rewind


So, the idea is to write the S3 files directly to the zip stream as you're downloading them. This way you're not storing the individual S3 files on your filesystem, only the resulting zip file. Shrine has a way to open an IO to the uploaded file (whichever storage you're using), and then `IO.copy_stream` will copy the file to the stream in chunks. This is then both storage efficient and memory efficient.
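
To see the chunked copying in isolation, here is a minimal sketch that uses only the Ruby standard library. `CountingSink` is a hypothetical stand-in for the zip stream (as far as `IO.copy_stream` is concerned, any object that responds to `write` can be the destination), and the `StringIO` stands in for the IO that Shrine opens. Neither is part of Shrine or rubyzip.

```ruby
require "stringio"

# Hypothetical stand-in for the zip stream: records the size of every chunk
# that IO.copy_stream hands to it, without keeping the data itself.
class CountingSink
  attr_reader :chunk_sizes

  def initialize
    @chunk_sizes = []
  end

  # IO.copy_stream only needs the destination to respond to #write
  # and to return the number of bytes it accepted.
  def write(data)
    @chunk_sizes << data.bytesize
    data.bytesize
  end
end

source = StringIO.new("x" * 1_000_000) # stand-in for a downloaded S3 file
sink   = CountingSink.new

copied = IO.copy_stream(source, sink)

puts "copied #{copied} bytes in #{sink.chunk_sizes.size} writes"
puts "largest single write: #{sink.chunk_sizes.max} bytes"
```

The source arrives as a series of small writes rather than one large string, which is why memory use stays roughly constant regardless of file size.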

The `uploaded_files` here should be an array of Shrine::UploadedFile objects. I'm assuming that these S3 files are attachments to some database records, so these objects are what the Shrine attachment getter method returns:

photo.image #=> #<Shrine::UploadedFile>


Then if you want to do this zipping on a request, you can use something like `send_file` to stream the zip file to the response body.

send_file zip_file.path


Let me know if this helps!

Kind regards,
Janko

--
You received this message because you are subscribed to the Google Groups "Shrine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ruby-shrine+unsubscribe@googlegroups.com.
To post to this group, send email to ruby-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ruby-shrine/c786b8cb-85b1-4bd7-8301-f7be4ffd1a4e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Yiheng

Nov 2, 2016, 5:46:22 AM
to Shrine
Hi Janko,

Thank you for the help. I see that loading the S3 files directly into the zip stream is an efficient approach.

Just to clarify what happens after that: the zip_file will then be saved on the server disk ('zip_file.fsync') and finally sent to the user, right? (If so, this might still be a problem, as one zip file could be 50 GB, and a few simultaneous requests could add up to hundreds of GB, which exceeds my single server's disk size.)



Best,
Yiheng

Janko Marohnić

Nov 2, 2016, 5:59:38 AM
to Yiheng, Shrine
Yes, that's right. If you know of any better place that the zip contents could be stored, let me know and we'll see how we can modify the code.

Maybe rubyzip has a feature for streaming the zipped contents as we are writing to the zip; I haven't looked very deeply into the gem. Perhaps some other Ruby library for zipping has such a feature.
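
One way to picture such streaming, independent of any particular zip library, is to give the archive writer a write-only sink that forwards each chunk to the response as soon as it is produced, instead of a Tempfile. The sketch below uses only the standard library and deliberately leaves out the zip encoding itself. `ForwardingSink` and `StreamedBody` are hypothetical names. It also assumes that the zip library in use can write to a destination that cannot seek or rewind, which would need to be checked for rubyzip.

```ruby
require "stringio"

# Hypothetical write-only sink: passes each chunk to a block right away,
# so nothing accumulates on disk or in memory.
class ForwardingSink
  def initialize(&on_chunk)
    @on_chunk = on_chunk
  end

  def write(data)
    # IO.copy_stream may reuse its read buffer, so hand over a copy.
    @on_chunk.call(data.dup)
    data.bytesize
  end

  def <<(data)
    write(data)
    self
  end
end

# Hypothetical response body in the Rack style: it responds to #each and
# yields strings. A streaming zip writer would wrap `sink` here; this sketch
# copies the sources through unchanged to show only the data flow.
class StreamedBody
  def initialize(sources)
    @sources = sources
  end

  def each(&on_chunk)
    sink = ForwardingSink.new(&on_chunk)
    @sources.each { |io| IO.copy_stream(io, sink) }
  end
end

sources  = [StringIO.new("a" * 40_000), StringIO.new("b" * 40_000)]
received = []
StreamedBody.new(sources).each { |chunk| received << chunk }

puts "received #{received.size} chunks, #{received.sum(&:bytesize)} bytes"
```

With this shape the server never holds a whole archive, so disk usage no longer grows with the number of simultaneous downloads. The trade-off is that the total size is not known up front.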

Kind regards,
Janko


Yiheng

Nov 2, 2016, 6:05:31 AM
to Shrine
I see. I'll explore and reply again if I find a better way.

Thank you for the guidance and the wonderful Shrine :)

Sankalp Dwivedi

Nov 21, 2022, 1:37:18 AM
to Shrine
Hi, did you find a way to do it??