What's wrong in async URL fetch?


Roman Mirr

Apr 2, 2019, 10:15:15 AM
to EventMachine
I'm trying to fetch multiple URLs asynchronously, but only one URL succeeds.
Please help me find the cause in the code below.

require 'eventmachine'
require 'em-http-request'

def url_handler(url)
  http = EventMachine::HttpRequest.new(url).get

  http.errback {
    puts "Oops #{url}"
    puts '-'*20
    EventMachine.stop
  }
  http.callback {
    puts "#{http.response_header.status} #{url}"
    # p http.response_header
    p http.response[0,60]
    puts '-'*20
    EventMachine.stop
  }
end

urls = [
  'https://www.cloudflare.com/robots.txt2', # 404, not found
  'http://example.com/',
  'https://www.cloudflare.com/robots.txt',
  'https://rubygems.org/gems/em-http-request/versions.atom',
  'http://twitter.com/robots.txt',
  # 'http://localhost/robots.txt',
]

EventMachine.run do
  urls.each {|url| url_handler(url) }
end

Why is the errback being called?

An example output:

200 http://example.com/
"<!doctype html>\n<html>\n<head>\n    <title>Example Domain</tit"
--------------------
Oops https://www.cloudflare.com/robots.txt2
--------------------
Oops https://www.cloudflare.com/robots.txt
--------------------
Oops https://rubygems.org/gems/em-http-request/versions.atom
--------------------
Oops http://twitter.com/robots.txt
--------------------

Marcin Bartkowiak

Sep 19, 2019, 4:00:06 PM
to EventMachine
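Remove EventMachine.stop from the success callback. The first request to finish stops the reactor, which closes the connections still in flight and fires their errbacks. This code works on my machine:
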
require 'eventmachine'
require 'em-http-request'

def url_handler(url)
  http = EventMachine::HttpRequest.new(url).get

  http.errback {
    puts "Oops #{url}"
    puts '-'*20
  }

  http.callback {
    puts "#{http.response_header.status} #{url}"
    # p http.response_header
    p http.response[0,60]
    puts '-'*20
    # No EventMachine.stop here: stopping the reactor on the first response
    # closes the remaining connections and triggers their errbacks.
  }
end

urls = [
  'https://www.cloudflare.com/robots.txt2', # 404, not found
  'http://example.com/',
  'https://www.cloudflare.com/robots.txt',
  'https://rubygems.org/gems/em-http-request/versions.atom',
  'http://twitter.com/robots.txt',
  # 'http://localhost/robots.txt',
]

EventMachine.run do
  urls.each {|url| url_handler(url) }
end

Output:

200 http://twitter.com/robots.txt
"#Google Search Engine Robot\nUser-agent: Googlebot\nAllow: /?_"
--------------------
200 https://www.cloudflare.com/robots.txt
"#    .__________________________.\n#    | .__________________"
--------------------
200 http://example.com/
"<!doctype html>\n<html>\n<head>\n    <title>Example Domain</tit"
--------------------
200 https://rubygems.org/gems/em-http-request/versions.atom
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<feed xmlns=\"http://w"
--------------------
404 https://www.cloudflare.com/robots.txt2
"\n<!DOCTYPE html>\n<html lang=\"en\" itemscope itemtype=\"http://"

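If EventMachine.stop is removed from the callbacks entirely, EventMachine.run never returns, so the script has to be interrupted by hand. One way around that, as a rough sketch rather than a tested fix, is to count the outstanding requests and stop the reactor only when the last one has reported back. PendingCounter and fetch_all are names made up for this example; only EventMachine.run, EventMachine.stop, HttpRequest#get, callback, and errback come from the code in this thread.

```ruby
# Hypothetical helper: counts outstanding requests and runs a block once
# every one of them has reported back (success or failure).
class PendingCounter
  def initialize(total, &on_done)
    @remaining = total
    @on_done = on_done
    # Nothing to wait for: fire immediately.
    @on_done.call if @remaining.zero?
  end

  # Call once per finished request, from both callback and errback.
  def done!
    @remaining -= 1
    @on_done.call if @remaining.zero?
  end
end

# Hypothetical wrapper around the url_handler logic from this thread.
def fetch_all(urls)
  # Loaded here so the helper above can be used without the gems installed.
  require 'eventmachine'
  require 'em-http-request'

  EventMachine.run do
    # Stop the reactor only after the last request has finished.
    pending = PendingCounter.new(urls.size) { EventMachine.stop }

    urls.each do |url|
      http = EventMachine::HttpRequest.new(url).get

      http.errback do
        puts "Oops #{url}"
        pending.done!
      end

      http.callback do
        puts "#{http.response_header.status} #{url}"
        pending.done!
      end
    end
  end
end

# Usage (performs real network requests):
# fetch_all(['http://example.com/', 'https://www.cloudflare.com/robots.txt'])
```

Calling done! from both the callback and the errback matters: a failed request still counts as finished, otherwise a single error would keep the reactor running forever.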