Files not present in cache on Heroku

desa.al...@gmail.com

Aug 23, 2016, 9:59:28 AM
to Shrine
Hey there!

I am using Shrine to handle file uploads in my JSON API. I set Heroku's tmp directory as the cache and S3 as the permanent storage. I also use the backgrounding plugin with Sidekiq to promote files to S3.

Everything works fine in my development environment. In production, though, the promotion job fails because it can't find the cached file when it runs.

This is my Shrine initializer:

require 'shrine/storage/file_system'
require 'shrine/storage/s3'

Shrine.plugin :activerecord
Shrine.plugin :reform
Shrine.plugin :determine_mime_type
Shrine.plugin :backgrounding

Shrine.storages = {
  cache: Shrine::Storage::FileSystem.new(Rails.root.join('tmp/uploads'))
}

Shrine.storages[:store] = if Rails.env.production?
  Shrine::Storage::S3.new(
    access_key_id: ENV.fetch('AWS_ACCESS_KEY_ID'),
    secret_access_key: ENV.fetch('AWS_SECRET_ACCESS_KEY'),
    region: ENV.fetch('AWS_REGION'),
    bucket: ENV.fetch('AWS_BUCKET'),
  )
else
  Shrine::Storage::FileSystem.new('public', prefix: 'uploads/store')
end

Shrine::Attacher.promote { |data| UploadFilePromotionJob.perform_later(data) }
Shrine::Attacher.delete { |data| UploadFileDeletionJob.perform_later(data) }

And here's my promotion job:

# frozen_string_literal: true
class UploadFilePromotionJob < ActiveJob::Base
  queue_as :default

  def perform(data)
    Shrine::Attacher.promote(data)
  end
end


Can you tell what's wrong? As far as I can see, the files should be in Rails' tmp directory, ready to be promoted, but they are not there for some reason.

Thanks,

Alessandro

Janko Marohnić

Aug 23, 2016, 2:18:36 PM
to desa.al...@gmail.com, Shrine
Since you're using Sidekiq, I'm assuming you're running it on a separate worker dyno. The problem is that each dyno has its own (ephemeral) filesystem, so a file uploaded to the web dyno's filesystem won't be visible to the worker dyno.

I highly recommend doing direct uploads to S3. That way you can show the cached file while the background job is running without needing the download_endpoint plugin, and you also avoid the risk of the user hitting Heroku's 30-second request timeout.

See the direct_upload plugin documentation for more details; I also wrote a guide that's linked on the website, and you can look at the Roda or Rails demo app.
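
To give a rough idea of the server side, here's a minimal sketch of a presign endpoint for a JSON API (not from your code: the controller name and key scheme are hypothetical, it assumes :cache has been switched to Shrine::Storage::S3, and #presign is assumed to return presigned POST data with a URL and fields):

# Hypothetical sketch: a JSON endpoint returning S3 presign parameters,
# which the client then uses to upload the file directly to S3.
class PresignsController < ApplicationController
  def create
    cache = Shrine.storages[:cache] # assumed to be Shrine::Storage::S3

    # Random key under which the client will upload the file.
    key = "#{SecureRandom.hex}/#{params[:filename]}"

    presign = cache.presign(key)

    render json: { url: presign.url, fields: presign.fields }
  end
end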

Kind regards,
Janko

Alessandro Desantis

Aug 24, 2016, 8:37:06 AM
to Janko Marohnić, Shrine
Thanks for the quick reply, Janko!

I thought direct uploads to S3 might be a solution, but how can I do that in a JSON API? Basically, I want to implement something like Stripe's uploads API: https://stripe.com/docs/api/ruby#file_uploads.

Would it work to simply have the cache storage be S3 rather than the filesystem?

Best,
Alessandro

Janko Marohnić

Aug 24, 2016, 2:08:04 PM
to Alessandro Desantis, Shrine
I see. In that case it probably doesn't make sense to require the client to upload the file to S3 before sending it to the JSON API. It makes more sense for the client to upload the file through your API and let the API decide where to store it, as you already sensed.

If S3 were set as the cache, that would have a significant performance impact on the request: in addition to the client uploading the file to your API, your API would also have to upload it to S3. Since this happens synchronously, it would also reduce the throughput of your app.
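
Just to illustrate what that setup would look like (the "cache"/"store" prefixes are only a common convention, not something from your code):

s3_options = {
  access_key_id:     ENV.fetch('AWS_ACCESS_KEY_ID'),
  secret_access_key: ENV.fetch('AWS_SECRET_ACCESS_KEY'),
  region:            ENV.fetch('AWS_REGION'),
  bucket:            ENV.fetch('AWS_BUCKET'),
}

Shrine.storages = {
  cache: Shrine::Storage::S3.new(prefix: 'cache', **s3_options), # temporary storage
  store: Shrine::Storage::S3.new(prefix: 'store', **s3_options), # permanent storage
}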

It would be great to use something for the cache that is distributed (shared between dynos) but still fast to "upload" to. I think MongoDB's GridFS would be a good fit for that, and there is already a shrine-gridfs integration. I don't know how cheap a MongoDB Heroku add-on is, though.
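
A rough sketch of what that could look like (the connection string is a placeholder, and the exact constructor should be double-checked against the shrine-gridfs README):

require 'shrine/storage/gridfs'

# Placeholder connection -- point this at your MongoDB add-on.
client = Mongo::Client.new(ENV.fetch('MONGODB_URI'))

Shrine.storages[:cache] = Shrine::Storage::Gridfs.new(client: client)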

If you don't expect large files to be uploaded, another option would be to cache them in an SQL database with shrine-sql. Heroku counts the number of rows you have in the database, so that could be feasible. Note that in both cases you might want to load the delete_promoted plugin to delete cached files after they have been promoted to permanent storage.
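
A similar sketch for the SQL-backed cache (the table name and connection are placeholders; double-check the constructor against the shrine-sql README):

require 'sequel'
require 'shrine/storage/sql'

# Placeholder connection and table name.
database = Sequel.connect(ENV.fetch('DATABASE_URL'))

Shrine.storages[:cache] = Shrine::Storage::Sql.new(database: database, table: :files)

# Deletes cached files once they have been promoted to permanent storage.
Shrine.plugin :delete_promoted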

Kind regards,
Janko