Custom remote URL downloader


Walter Lee Davis

May 29, 2018, 3:50:50 PM
to Shrine
I'm looking at this documentation: https://shrinerb.com/rdoc/classes/Shrine/Plugins/RemoteUrl.html#module-Shrine::Plugins::RemoteUrl-label-Custom+downloader

Is Down::Http still the recommended way to extend this? I need to send a custom cookie along with the request for a file to a protected download site. I'm replacing a CarrierWave system that looked like this:

# monkey-patch carrierwave
require 'open-uri'
require 'carrierwave'

CarrierWave::Uploader::Download::RemoteFile.class_eval do
  def initialize(uri, headers = {})
    @uri = uri
    @headers = headers
  end

  private

  def file
    if @file.blank?
      @file = Kernel.open(@uri.to_s, @headers)
      @file = @file.is_a?(String) ? StringIO.new(@file) : @file
    end
    @file

  rescue Exception => e
    raise CarrierWave::DownloadError, "could not download file: #{e.message}"
  end
end

# ye olde uploader
class DocumentUploader < CarrierWave::Uploader::Base

  # add patch to send CID cookie with file requests
  attr_accessor :extra_headers

  def download!(uri)
    processed_uri = process_uri(uri)
    @extra_headers = {} unless processed_uri.host.match /med.upenn.edu\Z/
    file = RemoteFile.new(processed_uri, @extra_headers)
    raise CarrierWave::DownloadError, "trying to download a file which is not served over HTTP" unless file.http?
    cache!(file)
  end

  # Choose what kind of storage to use for this uploader:
  storage :file
  # storage :fog
  unless Rails.env.test? || Rails.env.ceal_staging?
    permissions 0664
    directory_permissions 02775
  end

  # Override the directory where uploaded files will be stored.
  # This is a sensible default for uploaders that are meant to be mounted:
  def store_dir
    if Rails.env.test? || Rails.env.ceal_staging?
      "uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id.to_i}"
    else
      "/data/web/apps/fapd_emc/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
    end
  end

  # Add a white list of extensions which are allowed to be uploaded.
  def extension_white_list
    %w(rtf txt doc docx xls xlsx pdf php)
  end

  # Override the filename of the uploaded files:
  # Avoid using model.id or version_name here, see uploader/store.rb for details.
  def filename
    # Candidate's last name - packet id - type
    if original_filename
      return "#{model.slug}.#{file.extension}"
    end
  end
end

# finally, the controller
class PacketDocsController < ApplicationController
  load_and_authorize_resource

  def update
    # Nothing special needed here, because CarrierWave knows what to do if there is
    # a :remote_file_url or :file in the params, and does the assignment itself.

    # --> here is where the headers are injected
    @packet_doc.file.extra_headers = {'Cookie' => 'CID=' + cookies[:CID]}

    @packet = @packet_doc.packet(true)
    if @packet_doc.update(packet_doc_params)
      render template: 'packet_docs/update', layout: false
    else
      render template: 'packet_docs/update', layout: false
    end
  end
  ...
end

Walter

Janko Marohnić

May 30, 2018, 5:13:02 AM
to Walter Lee Davis, Shrine
I've just pushed a commit to master that adds a Shrine::Attacher#assign_remote_url method, which accepts additional downloader options. If you're keeping the default downloader (Down::NetHttp), which interprets options with string keys as request headers, then you can do this:

class PacketDocsController < ApplicationController
  # ...
  def update
    @packet_doc.assign_attributes(packet_doc_params.except(:remote_file_url))

    if (remote_url = packet_doc_params[:remote_file_url])
      @packet_doc.file_attacher.assign_remote_url(remote_url, { 'Cookie' => 'CID=' + cookies[:CID] })
    end

    if @packet_doc.save

      render template: 'packet_docs/update', layout: false
    else
      render template: 'packet_docs/update', layout: false
    end
  end
  # ...
end


If you want to use Down::Http, then you can configure the :downloader proc to forward any additional options, which can conveniently be abbreviated to:

plugin :remote_url, downloader: Down::Http.method(:download)

Then in your controller you would use :headers, as that's how Down::Http.download accepts request headers:

@packet_doc.file_attacher.assign_remote_url(remote_url, headers: { 'Cookie' => 'CID=' + cookies[:CID] })

Let me know if this works for you.

Kind regards,
Janko



--
You received this message because you are subscribed to the Google Groups "Shrine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ruby-shrine+unsubscribe@googlegroups.com.
To post to this group, send email to ruby-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ruby-shrine/1EAE9CB4-3AA9-438D-9C1D-2C0C8B4005CA%40wdstudio.com.
For more options, visit https://groups.google.com/d/optout.

Janko Marohnić

May 30, 2018, 5:53:39 AM
to Walter Lee Davis, Shrine
To be more future-proof, I changed downloader options to be accepted via the :downloader option instead:

attacher.assign_remote_url(remote_url, downloader: { 'Cookie' => 'CID=' + cookies[:CID] })              # Down::NetHttp
attacher.assign_remote_url(remote_url, downloader: { headers: { 'Cookie' => 'CID=' + cookies[:CID] } }) # Down::Http

Btw, http.rb also accepts the :cookies option, so the Down::Http version can probably be shortened to

attacher.assign_remote_url(remote_url, downloader: { cookies: { CID: cookies[:CID] } }) # Down::Http

Kind regards,
Janko

Walter Lee Davis

May 30, 2018, 2:41:20 PM
to Janko Marohnić, Shrine
Thanks! This looks really promising. I look forward to getting rid of the monkey-patch. 

Walter

Walter Davis

Jun 13, 2018, 6:41:25 PM
to Shrine
This is very close to working, but I have hit an SSL issue, and I wonder where I can insert a configuration option to force SSL3, as noted here: https://stackoverflow.com/questions/17369962/opensslsslsslerror-ssl-connect-returned-1-errno-0-state-unknown-state-unkn. As noted in that SO question, when I visit the server I need to download from, I get the error 'SSL_connect returned=1 errno=0 state=SSLv2/v3 read server hello A: unknown protocol'. Side note: I'm not concerned about the security ramifications of an old SSL, the two servers I am communicating from and to are behind a lot of firewall and vpn security. 

Walter

Janko Marohnić

Jun 13, 2018, 7:04:09 PM
to Walter Davis, Shrine
I don't think these SSL errors are related to the implementation of the Down gem, but just in case try making a GET request using Net::HTTP/http.rb directly and see if you get this error. If not, then it is a bug in the Down gem, in which case I would need to get a URL to the file on the server for debugging purposes.
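
A minimal sketch of such a direct check with Net::HTTP, assuming a placeholder URL and cookie value (neither comes from the real server). The request is only built here; nothing goes over the network until the lambda at the end is called. The `min_version` line only marks where protocol settings would go and is an illustrative value, not a recommendation.

```ruby
require "net/http"
require "openssl"
require "uri"

# Placeholder values, assumed for illustration only.
uri    = URI("https://example.com/download.php?cv=12345")
cookie = "CID=12345"

http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = (uri.scheme == "https")
# TLS protocol settings live on the Net::HTTP object (illustrative value).
http.min_version = OpenSSL::SSL::TLS1_2_VERSION

request = Net::HTTP::Get.new(uri)
request["Cookie"] = cookie

# Calling this lambda performs the request; an SSL failure would surface
# here as OpenSSL::SSL::SSLError, independently of Down and Shrine.
send_request = -> { http.request(request) }
```

If `send_request.call` raises the same SSL error in a console on the server, the problem sits below the Down gem.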

Kind regards,
Janko


Walter Lee Davis

Jun 14, 2018, 12:52:28 PM
to Janko Marohnić, Shrine
Thanks kindly for the offer to debug, but this is from my day job at University of Pennsylvania, and both servers are behind a half-dozen firewalls and CoSign security. These URLs are not publicly accessible.

I just tested again, and the original application (with CarrierWave) was able to work, but the differences between that application and the new one are many and varied, down to the major and minor version of Rails. We are not running a modern version of CarrierWave, either. Old software for old servers, I guess.

I will test using the Net::HTTP direct mode, I think you posted some code earlier in this thread that would allow us to hand-roll the downloader. Are you the author of Down as well?

Walter

Janko Marohnić

Jun 14, 2018, 2:36:32 PM
to Walter Lee Davis, Shrine
> I just tested again, and the original application (with CarrierWave) was able to work, but the differences between that application and the new one are many and varied, down to the major and minor version of Rails. We are not running a modern version of CarrierWave, either. Old software for old servers, I guess.

I think the difference might be between Net::HTTP/Ruby versions.

> I will test using the Net::HTTP direct mode, I think you posted some code earlier in this thread that would allow us to hand-roll the downloader.

I would recommend just trying to open a console on the remote server, and try to make GET requests directly, and see if you're getting SSL errors.
  1. First you can try open("https://...") provided by open-uri (this is what CarrierWave uses, so it should work)
  2. If it works, then check if Down::NetHttp.download("https://...") also works (it should, as it's just a wrapper around open-uri)
Down::NetHttp is what the remote_url plugin uses by default. When you figure out the SSL options for open-uri, you can just pass them to `Down::NetHttp.download` (by making a custom :downloader).

Since you're migrating from CarrierWave which uses open-uri, it probably makes sense to use `Down::NetHttp.download` (or just `Down.download`), as then you're still using open-uri at the end of the day, so there are fewer changes from the CarrierWave version. That should hopefully make it easier to figure out the SSL errors. Down::Http is what I would recommend just because I like the http.rb gem much better than Net::HTTP, but if CarrierWave's open-uri worked for you I don't think there is any reason to switch from the default Down backend which uses open-uri.

> Are you the author of Down as well?

I am, yes :). I created Down while building Shrine, as I realized I need downloading functionality in more places besides the remote_url plugin (e.g. storage classes, shrine-url gem)


Walter Davis

Jun 18, 2018, 2:19:03 PM
to Shrine
Finally got back to working on this. I was not able to get Down::Http passing the cookie correctly, but that's no doubt down to my code. What did work to get the file was to just do this in irb (after authenticating, so I got the right cookie):

f = open('https://[server.name].upenn.edu/[user]/my/fapd/apps/feds/download.php?cv=12345', 'Cookie'=> 'CID=[REDACTED]')


Here's what I tried to configure Down as my downloader:

# config/initializers/shrine.rb
Shrine.plugin :remote_url, downloader: Down::Http.method(:download), max_size: 50.megabytes

# app/controllers/documents_controller.rb
      @document.file_attacher.assign_remote_url(remote_url,
                                                headers: { 'Cookie' => 'CID=' + cookies[:CID] })

That doesn't work, and my guess is that I have configured the Shrine.plugin part incorrectly.

Additional debugging made it clear to me that the SSL problem may have been an unrelated problem, because I was able to move on to a different error (too many redirects, which means the cookie wasn't passed) when I hard-coded the desired destination address.

Walter


Walter Davis

Jun 18, 2018, 2:46:14 PM
to Shrine
Here's what I have for the configuration in the shrine initializer:

Shrine.plugin :remote_url, max_size: 50.megabytes, downloader: ->(url, max_size:) do
  Down::Http.download(url, max_size: max_size)
end

That's giving me the 'too many redirects' error that tells me that the request is being bounced to our SSO server because the cookie is not being set in the initial request.

On a side note, and unrelated to the use of your code, the stock example from the documentation is giving me a rubocop error (multiline procs should be written using the lambda method), but when I tried to rewrite this as a lambda, I get an error at application startup that I am trying to run a proc without arguments, and the server won't start. I gave up and disabled that cop for now.

Thanks again!

Walter

Janko Marohnić

Jun 19, 2018, 7:08:44 AM
to Walter Davis, Shrine
If you're using the latest master, you need to make sure that you're passing :headers and other downloader options inside the :downloader hash. The following script works for me, and uses https://httpbin.org to verify that cookies are indeed passed, even on redirects:

  require "shrine"
  require "shrine/storage/file_system"
  require "down/http"

  Shrine.storages = {
    cache: Shrine::Storage::FileSystem.new(Dir.tmpdir),
    store: Shrine::Storage::FileSystem.new(Dir.tmpdir),
  }

  Shrine.plugin :determine_mime_type
  Shrine.plugin :remote_url, downloader: Down::Http.method(:download), max_size: 10*1024*1024

  class Photo
    include Shrine::Attachment.new(:image)

    attr_accessor :image_data
  end

  photo = Photo.new
  photo.image_attacher.assign_remote_url("https://httpbin.org/cookies", downloader: {
    headers: { "Cookie" => "CID=12345" }
  })
  puts photo.image.read
  #=> '{"cookies":{"CID":"12345"}}'

  photo = Photo.new
  photo.image_attacher.assign_remote_url("https://httpbin.org/redirect-to", downloader: {
    params:  { url: "https://httpbin.org/cookies" },
    headers: { "Cookie" => "CID=12345" },
  })
  puts photo.image.read
  #=> '{"cookies":{"CID":"12345"}}'


> On a side note, and unrelated to the use of your code, the stock example from the documentation is giving me a rubocop error (multiline procs should be written using the lambda method), but when I tried to rewrite this as a lambda, I get an error at application startup that I am trying to run a proc without arguments, and the server won't start. I gave up and disabled that cop for now.

If you're using `lambda`, then you need to use curly braces for the block instead of do...end, because do...end would be interpreted as passing the block to the `Shrine.plugin` method, due to precedence rules, while curly braces have "tighter" rules. To illustrate, the following example:

  puts Benchmark.realtime do
    # work
  end

won't work, because `do...end` is actually passed to the #puts method, not Benchmark.realtime, whereas the following will work:

  puts Benchmark.realtime {
    # work
  }
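
The same binding rule can be observed without Benchmark. The following is a small sketch using two hypothetical methods, `outer` and `inner`, which exist only to report which of them received the block:

```ruby
# Hypothetical methods, defined only to report which one received the block.
def outer(arg = nil, &block)
  block ? :outer_got_block : arg
end

def inner(&block)
  block ? :inner_got_block : :inner_got_no_block
end

def with_do_end
  # do...end binds to the leftmost call: outer(inner) do ... end
  outer inner do
    :work
  end
end

def with_braces
  # { } binds to the nearest call: outer(inner { ... })
  outer inner { :work }
end

do_end_result = with_do_end  # :outer_got_block
braces_result = with_braces  # :inner_got_block
```

In the `Shrine.plugin ... downloader: lambda do ... end` case, `Shrine.plugin` plays the role of `outer` and `lambda` the role of `inner`, so `lambda` ends up being called without a block.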

Kind regards,
Janko


Walter Davis

Jun 19, 2018, 1:06:46 PM
to Shrine
Okay. Apologies if I have gotten annoying on this subject. I have tried using both Down::Http and Down::NetHttp as the backend, and I am still getting a response which I interpret as meaning that the cookie is not being sent. I am no doubt mixing something up here, so I will try to put the relevant parts here, and hope you can take a moment to correct me.

# config/initializers/shrine.rb
require 'shrine'
require 'shrine/storage/file_system'
require 'down/net_http'

def self.data_filer?
  (!Rails.env.development? && !Rails.env.test? && !Rails.env.ceal_staging?)
end

def self.storage_prefix
  { somdev_migrations: 'ceal_development', somprd_migrations: 'ceal_production' }[Rails.env.to_sym] || Rails.env.to_s
end

Shrine.storages = if data_filer?
                    {
                      cache: Shrine::Storage::FileSystem.new('/data/web/apps/emc',
                                                             prefix: "#{storage_prefix}/uploads/cache"),
                      store: Shrine::Storage::FileSystem.new('/data/web/apps/emc',
                                                             prefix: "#{storage_prefix}/uploads")
                    }
                  else
                    {
                      cache: Shrine::Storage::FileSystem.new('public', prefix: 'uploads/cache'),
                      store: Shrine::Storage::FileSystem.new('public', prefix: 'uploads')
                    }
                  end

Shrine.plugin :activerecord
Shrine.plugin :cached_attachment_data
Shrine.plugin :restore_cached_data
Shrine.plugin :determine_mime_type
Shrine.plugin :remote_url, downloader: ->(url, max_size) { Down::NetHttp.download(url, max_size: max_size) },
                           max_size: 50.megabytes

# app/controllers/documents_controller.rb
  def update
    @document.assign_attributes(document_params.except(:remote_file_url))

    if (remote_url = document_params[:remote_file_url])
      @document.file_attacher.assign_remote_url(remote_url,
                                                downloader: { 'Cookie' => 'CID=' + cookies[:CID] })
    end
    if @document.save
      redirect_to @document.packet
    else
      render :edit
    end
  end

Unless I've completely lost the thread, I believe these two parts (setup and controller) are in sync with one another, and using the correct syntax to send cookies. I've confirmed from the log that the remote_url is being set to the correct value, and visited that URL myself in a browser and got the file after one redirect. Isn't the default number of allowed redirects 2?

Thanks again,

Walter

Janko Marohnić

Jun 19, 2018, 3:48:38 PM
to Walter Davis, Shrine
No worries! :)

So, the remote_url plugin calls the downloader proc with a URL as the first argument, and options (including :max_size) as the second argument. So in your case the `max_size` argument will contain the hash of options, including the `Cookie` key-value pair, and you're passing that hash as the `:max_size` option to Down::NetHttp.download.

You should make `:max_size` a keyword argument (as shown in the documentation) and have it accept additional options, then forward everything to Down::NetHttp.download:

  Shrine.plugin :remote_url, max_size: 50.megabytes, downloader: lambda { |url, max_size:, **options|
    Down::NetHttp.download(url, max_size: max_size, **options)
  }

Now that you have that, you can remove the :downloader option altogether, as Down::NetHttp is remote_url's default downloader, and it already forwards additional options ;)

  Shrine.plugin :remote_url, max_size: 50.megabytes
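
The difference between the two downloader signatures can be seen without Shrine or Down installed. This is a rough sketch in which `fake_download` is a hypothetical stand-in for Down::NetHttp.download that simply returns what it was given, and the call at the bottom is an assumed approximation of how the plugin invokes the downloader (URL first, options after):

```ruby
# Hypothetical stand-in for Down::NetHttp.download: echoes its arguments.
fake_download = ->(url, **options) { { url: url, options: options } }

# Positional second parameter: it receives the whole options hash.
positional = ->(url, max_size) { fake_download.call(url, max_size: max_size) }

# Keyword :max_size plus **options: everything is forwarded unchanged.
forwarding = lambda { |url, max_size:, **options|
  fake_download.call(url, max_size: max_size, **options)
}

# Assumed shape of the options handed to the downloader.
call_options = { max_size: 50 * 1024 * 1024, max_redirects: 8 }

broken = positional.call("https://example.com/file", **call_options)
fixed  = forwarding.call("https://example.com/file", **call_options)

# broken[:options][:max_size] is the entire options hash, not a number,
# and :max_redirects never reaches the downloader as its own option.
# fixed[:options] contains both :max_size and :max_redirects as given.
```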

Kind regards,
Janko


Walter Davis

Jun 19, 2018, 4:28:04 PM
to Shrine
You've confused me even more! 

So I can just use 
    Shrine.plugin :remote_url, max_size: 50.megabytes
as my setup in shrine.rb, and 
    @document.file_attacher.assign_remote_url(remote_url, downloader: { 'Cookie' => 'CID=' + cookies[:CID] })
in my controller?

That's giving me more errors:
download failed: too many redirects
which is where I started today...

Walter

Janko Marohnić

Jun 19, 2018, 5:57:53 PM
to Walter Davis, Shrine
Something is different in your environment, because on my computer the "Cookie" header is clearly sent:

  require "shrine"
  require "shrine/storage/file_system"

  Shrine.storages = {
    cache: Shrine::Storage::FileSystem.new(Dir.tmpdir),
    store: Shrine::Storage::FileSystem.new(Dir.tmpdir),
  }

  Shrine.plugin :determine_mime_type # get rid of the deprecation warning
  Shrine.plugin :remote_url, max_size: 50*1024*1024

  class Photo
    include Shrine::Attachment.new(:image)

    attr_accessor :image_data
  end

  photo = Photo.new
  photo.image_attacher.assign_remote_url("https://httpbin.org/cookies", downloader: { "Cookie" => "CID=12345" })
  puts photo.image.read
  #=> '{"cookies":{"CID":"12345"}}'

  photo = Photo.new
  photo.image_attacher.assign_remote_url("https://httpbin.org/redirect-to?url=/cookies", downloader: { "Cookie" => "CID=12345" })
  puts photo.image.read
  #=> '{"cookies":{"CID":"12345"}}'


What output do you get when you execute this script?



Walter Davis

Jun 20, 2018, 10:36:44 AM
to Shrine
I got an error that the assign_remote_url method wasn't available. I tried installing the master of the gem (as I have in my Rails app) but I can't do that on this server when running as myself. I'm going to try fixing the environment so I can run master, and I'll let you know.

Walter

Walter Davis

Jun 20, 2018, 10:49:26 AM
to Shrine
I managed to get master installed, and got the same results you reported with the cookie being successfully transmitted and set. Yesterday, in my frustration, I wrote a controller method that just used open-uri and it successfully downloaded the file from the server, but then I could not work out how to assign the file data to my model. I tried file= and assign file: but got a JSON encoding error instead. I'm typing this by memory at the moment, but this is more or less what it did:

def download(url)
  require 'open-uri'
  f = File.open(url, 'rb', 'Cookie' => 'CID=' + cookies[:CID])
  @document.file = f.read # and other attempts at assignment
end

I could echo out the file that I got in this way through the console, and see that it was the actual generated RTF file I was expecting, rather than the HTML file I would see if the cookie wasn't there (and I was redirected to the SSO login page). 

Another thing that has occurred to me here is that maybe I am being redirected more than the default 2 hops, and I just don't know it. You've mentioned in the documentation that open-uri doesn't limit redirects at all when used natively, and your wrappers do so they will be good citizens and not get DOSed by a misconfigured target page. 
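
To make the redirect limit concrete, here is a minimal sketch of a capped redirect loop. It is not Down's implementation: `fetch_with_redirect_limit`, `TooManyRedirects`, and the simulated `routes` table are all hypothetical, and no network access is involved.

```ruby
# Hypothetical error class for this sketch.
class TooManyRedirects < StandardError; end

# fetch is any callable returning [status, payload], where payload is the
# redirect target for 3xx responses and the body otherwise.
def fetch_with_redirect_limit(url, fetch:, max_redirects: 2)
  redirects_left = max_redirects
  loop do
    status, payload = fetch.call(url)
    return payload unless (300..399).cover?(status)
    raise TooManyRedirects, "too many redirects" if redirects_left.zero?

    redirects_left -= 1
    url = payload
  end
end

# Simulated server: three redirects before the file is reached.
routes = {
  "/a"    => [302, "/b"],
  "/b"    => [302, "/c"],
  "/c"    => [302, "/file"],
  "/file" => [200, "file contents"],
}
fake_fetch = ->(url) { routes.fetch(url) }

# A limit of 8 is enough for three hops.
within_limit = fetch_with_redirect_limit("/a", fetch: fake_fetch, max_redirects: 8)

# A limit of 2 gives up on the third redirect.
limit_error =
  begin
    fetch_with_redirect_limit("/a", fetch: fake_fetch, max_redirects: 2)
    nil
  rescue TooManyRedirects => e
    e.message
  end
```

A chain that bounces through an SSO server can easily exceed a small limit, which would produce this error even when the cookie itself is fine.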

In your last comment yesterday, you said that the default behavior (without writing a custom downloader proc) was to just accept and pass through additional options. I haven't tried this yet, but what chances do you give this configuration to work if the real problem is the limited number of allowed redirects?

Shrine.plugin :remote_url, max_size: 50.megabytes, follow: { max_hops: 8 }

Thanks,

Walter

Walter Davis

Jun 20, 2018, 10:54:26 AM
to Shrine
Well, I answered my own question here: this particular syntax is probably incorrect, and if redirects is the problem, it is definitely being ignored, because I get the same error back: too many redirects.

Walter

Janko Marohnić

Jun 20, 2018, 11:06:46 AM
to Walter Davis, Shrine
> Yesterday, in my frustration, I wrote a controller method that just used open-uri and it successfully downloaded the file from the server, but then I could not work out how to assign the file data to my model. I tried file= and assign file: but got a JSON encoding error instead.

You need to assign the IO object itself, not its content. When you assign a string, Shrine assumes you're assigning uploaded file JSON data. So, the corrected code would be:

  def download(url)
    require 'open-uri'
    file = open(url, 'rb', 'Cookie' => 'CID=' + cookies[:CID])
    @document.file = file
  end

But I would still recommend Down.download if you're expecting many redirects, because as you already remembered, open-uri doesn't limit the number of redirects.
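
The string-versus-IO distinction can be illustrated with plain Ruby. This is only a sketch: `io_like?` is a hypothetical helper, and the method list is an assumption about what an IO-like object needs to offer, not code taken from Shrine.

```ruby
require "stringio"

# Assumed set of methods an IO-like object offers (hypothetical check).
IO_LIKE_METHODS = %i[read rewind eof? close size].freeze

def io_like?(value)
  IO_LIKE_METHODS.all? { |name| value.respond_to?(name) }
end

content = "{\\rtf1 example document}" # what f.read returns: a plain String
io      = StringIO.new(content)       # an object that can still be read

string_is_io = io_like?(content) # false: a String has no #read
object_is_io = io_like?(io)      # true
```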

> In your last comment yesterday, you said that the default behavior (without writing a custom downloader proc) was to just accept and pass through additional options. I haven't tried this yet, but what chances do you give this configuration to work if the real problem is the limited number of allowed redirects?

Yes, for sure, :max_redirects is just another option for Down::NetHttp, so the following should work:

  @document.file_attacher.assign_remote_url(remote_url, downloader: { max_redirects: 8 })

> Shrine.plugin :remote_url, max_size: 50.megabytes, follow: { max_hops: 8 }

Please, make sure you don't mix Down::NetHttp and Down::Http options (the options above are for Down::Http). Let's continue talking only in terms of Down::NetHttp, and later you can decide to switch to Down::Http if you want.

The remote_url plugin doesn't accept downloader options on load, you have to either override the :downloader and pass options there, or pass options to the new Attacher#assign_remote_url.


Walter Davis

Jun 20, 2018, 11:47:37 AM
to Shrine
Okay, we are much closer to success now. I have gotten to the page requesting a login (not what I want, but at least I can see it and understand where the failure is happening). Just to triple check, I have done these three things:

1. shrine.rb: require 'shrine'; require 'shrine/storage/file_system'; Shrine.plugin :remote_url, max_size: 50.megabytes

2. controller: 
  def import(remote_url)
    Rails.logger.info "#{remote_url}, downloader: { 'Cookie' => #{'CID=' + cookies[:CID]}, max_redirects: 8 }"
    @document.file_attacher.assign_remote_url(remote_url,
                                              downloader: { 'Cookie' => 'CID=' + cookies[:CID],
                                                            max_redirects: 8 })
  end

3. controller:
  def update
    @document.assign_attributes(document_params.except(:file_remote_url))
    if (remote_url = document_params[:file_remote_url])
      import(remote_url)
    end
    if @document.save
      redirect_to @document.packet
    else
      render :edit
    end
  end

This works without error, but it downloads the login page. I have double-checked that the cookie being set and sent is the same one that is live in my browser. Can you double-check my settings here, that I am using NetHttp syntax all the way through? That is the default, if you don't specify or require an additional Down flavor, right? And would you say that NetHttp is the closest thing to open-uri, options-wise? The fact that I am being redirected to SSO is a symptom of the cookie not being sent correctly, in my experience.

Walter Davis

Jun 20, 2018, 11:59:19 AM
to Shrine
More notes inline below:


On Wednesday, June 20, 2018 at 11:47:37 AM UTC-4, Walter Davis wrote:
> Okay, we are much closer to success now. I have gotten to the page requesting a login (not what I want, but at least I can see it and understand where the failure is happening). Just to triple check, I have done these three things:
>
> 1. shrine.rb: require 'shrine'; require 'shrine/storage/file_system'; Shrine.plugin :remote_url, max_size: 50.megabytes
>
> 2. controller:
>   def import(remote_url)
>     Rails.logger.info "#{remote_url}, downloader: { 'Cookie' => #{'CID=' + cookies[:CID]}, max_redirects: 8 }"

I have double-checked that if I paste the remote_url value into the browser where I am testing (cookie is set) the correct file downloads, not the login page.

>     @document.file_attacher.assign_remote_url(remote_url,
>                                               downloader: { 'Cookie' => 'CID=' + cookies[:CID],

I have also triple-checked that the cookie I see in my browser's console is the exact same value as the one I am sending in the downloader: hash when it runs from the server (and fails there).

Walter Davis

Jun 20, 2018, 5:02:27 PM
to Shrine
I have this working without the plugin, by using this method (bastardized from your blog post about Down):

  def import(remote_url)
    # @document.file_attacher.assign_remote_url(remote_url,
    #                                           downloader: { 'Cookie' => "CID=#{cookies[:CID]}",
    #                                                         max_redirects: 8 })
    require 'open-uri'
    uri = URI.parse(remote_url)
    io = uri.open('Cookie' => "CID=#{cookies[:CID]}", 'User-Agent' => 'EMC-bot')
    downloaded = Tempfile.new([File.basename(uri.path), '.rtf'])
    downloaded.write(io.read)
    downloaded.rewind

    @document.file = downloaded
  end

I was not able to get the trick with `mv` to work (I got an empty file, even though the download part worked perfectly), so I gave up so my story could get reviewed. I really want to get this working correctly, but I'll take "working" for now.

Walter

Janko Marohnić

Jun 20, 2018, 6:26:11 PM
to Walter Davis, Shrine
Your setup looks correct, and I think I know what might be the problem. Down.download implements its own redirects, due to not being able to limit open-uri's redirects, and there might be something that Down does incorrectly here regarding cookies.

When Down.download follows a redirect, it reads the "Set-Cookie" response header of the redirect response, and if it's present it assigns it to the "Cookie" request header for the follow-up request. So if the remote server returns "Set-Cookie" which doesn't include the "CID=..." value that was sent, then "CID=..." will not be sent in the redirect request.
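
The behaviour described above can be modelled in a few lines. This is a simplified sketch, not Down's actual code: `follow_up_headers` is a hypothetical helper and the header values are made up.

```ruby
# Hypothetical model of the described behaviour: if the redirect response
# carries Set-Cookie, its value replaces the Cookie request header outright.
def follow_up_headers(request_headers, redirect_response_headers)
  set_cookie = redirect_response_headers["Set-Cookie"]
  return request_headers unless set_cookie

  request_headers.merge("Cookie" => set_cookie)
end

original = { "Cookie" => "CID=12345", "User-Agent" => "EMC-bot" }

# Redirect that sets an unrelated cookie: CID is gone on the next request.
after_set_cookie = follow_up_headers(original, { "Set-Cookie" => "session=abc" })

# Redirect without Set-Cookie: the original Cookie header is kept.
after_plain_redirect = follow_up_headers(original, {})
```

Under this model, a server that answers the first request with a redirect plus any `Set-Cookie` of its own would cause the follow-up request to arrive without `CID`, which matches the symptom of landing on the login page.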

Could you check in the browser in the network requests inspector if that might be the case? In any case I will see how I can resolve this.

> And would you say that NetHttp is the closest thing to open-uri, options-wise?

Yes, Down::NetHttp.download accepts all options that open-uri accepts, and some additional as well. So, if replacing open-uri with Down::NetHttp.download doesn't work:

  def import(remote_url)
    require 'down'
    downloaded = Down.download(remote_url, 'Cookie' => "CID=#{cookies[:CID]}", 'User-Agent' => 'EMC-bot', max_redirects: 8)

    @document.file = downloaded
  end

then the issue is definitely in the behaviour that Down::NetHttp.download modifies from open-uri, most probably following redirects.
