Hi there,
I came across your blog while doing some Google searches.
I'm trying to figure out how to use firewatir to drive firefox to a
specific page and then scrape a gif off that page into a ruby
variable.
I found a demo on github which demonstrates the idea:
- git clone git://github.com/scrubber/scrubyt_examples.git
I tried this one:
$ grep firefox ruby_quiz_189.rb
data = Scrubyt::Extractor.define :agent => :firefox do
$ telnet localhost 9997
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Welcome to the Mozilla JavaScript Shell!
> exit()
Goodbye!
Connection closed by foreign host.
$ ruby ruby_quiz_189.rb
/pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/filters/download_filter.rb:19:in `download_file': undefined method `include?' for nil:NilClass (NoMethodError)
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/filters/download_filter.rb:8:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:250:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:248:in `each'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:248:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:279:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:275:in `each'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:275:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:270:in `each'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:270:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:279:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:275:in `each'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:275:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:270:in `each'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:270:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:279:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:275:in `each'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:275:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:270:in `each'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:270:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:279:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:275:in `each'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:275:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:270:in `each'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/scraping/pattern.rb:270:in `evaluate'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/shared/extractor.rb:137:in `evaluate_extractor'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/shared/extractor.rb:136:in `each'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/shared/extractor.rb:136:in `evaluate_extractor'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/shared/extractor.rb:133:in `loop'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/shared/extractor.rb:133:in `evaluate_extractor'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/shared/extractor.rb:132:in `catch'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/shared/extractor.rb:132:in `evaluate_extractor'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/shared/extractor.rb:85:in `initialize'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/shared/extractor.rb:32:in `new'
from /pt/r1/lib/ruby/gems/1.8/gems/scrubyt-0.4.1/lib/scrubyt/core/shared/extractor.rb:32:in `define'
from ruby_quiz_189.rb:114
$
As you can see, it errors out in scrubyt's download filter.
Firewatir itself seems to be working, though; I can see it driving my
browser around.
My question is simple:
How do I use firewatir to drive firefox to a page and then scrape a
jpg off that page into a ruby variable?
It looks like scrubyt is designed for exactly the task that I want to
do.
But I can't get it to work.
I've tried various combos of hpricot and mechanize.
Currently I have this:
scrubyt: 0.4.1
hpricot: 0.5
mechanize: 0.6.3
Since scrubyt is not working for me perhaps I should just focus on
firewatir which seems to be working.
Is it possible to use firewatir to drive firefox to a page and then
scrape a jpg off that page?
I don't need all the muscle of mechanize; I just want a simple jpg at
a specific URL.
The reason I need firefox is that it manages cookies for me and gets
me past the DOM-related JS on the page.
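
To make the goal concrete, here is a rough sketch of the last step I have in mind, using only Ruby's standard library. It assumes I can somehow read the page URL, the image's src attribute, and the cookie string out of the firefox session; how to get those from firewatir is exactly the part I am unsure about, so it is left out. The method names are my own invention, and the hash options to Net::HTTP.start assume a newer Ruby than the 1.8 install shown above.

```ruby
require 'net/http'
require 'uri'

# Resolve an <img> src (possibly relative) against the URL of the page
# it appeared on. Both arguments are assumed to come from the browser.
def resolve_image_url(page_url, img_src)
  URI.join(page_url, img_src)
end

# Build a GET request for the image and replay the browser's cookies.
# cookie_string is assumed to look like "name=value; other=value".
def build_image_request(image_uri, cookie_string)
  request = Net::HTTP::Get.new(image_uri.request_uri)
  request['Cookie'] = cookie_string unless cookie_string.to_s.empty?
  request
end

# Download the image and return its raw bytes as a Ruby String.
def fetch_image_bytes(image_uri, cookie_string)
  request = build_image_request(image_uri, cookie_string)
  use_ssl = (image_uri.scheme == 'https')
  response = Net::HTTP.start(image_uri.host, image_uri.port, :use_ssl => use_ssl) do |http|
    http.request(request)
  end
  raise "Download failed: HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  response.body
end
```

With that in place, something like `jpg = fetch_image_bytes(resolve_image_url(page_url, img_src), cookies)` would leave the image bytes in a ruby variable, provided the site accepts the replayed cookies.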
Perhaps you have some tips or clues?
-Audrey