How do I scrape dynamic content from the Struts framework with Ruby?

fugee ohu

Dec 25, 2018, 6:16:13 PM

to Ruby on Rails: Talk

Hassan Schroeder

Dec 25, 2018, 6:40:23 PM

to rubyonrails-talk

On Tue, Dec 25, 2018 at 3:16 PM fugee ohu <fuge...@gmail.com> wrote:
>
> How do i scrape dynamic content from Struts framework with Ruby

Same as any web source: send a request, parse the response. Is
there some particular issue you're encountering?
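
As a rough illustration of "send a request, parse the response", here is a minimal sketch using only Ruby's standard library. The endpoint `https://example.com/search.do` and the `limit`/`offset` parameters are placeholders, not taken from the site in question, and `build_uri` and `fetch_body` are hypothetical helper names.

```ruby
require 'net/http'
require 'uri'

# Build a request URI from a base URL and a hash of query parameters.
def build_uri(base, params)
  uri = URI.parse(base)
  uri.query = URI.encode_www_form(params)
  uri
end

# Send a GET request and return the response body as a String.
# Raises if the server does not answer with a 2xx status.
def fetch_body(uri)
  response = Net::HTTP.get_response(uri)
  raise "HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  response.body
end

# Placeholder endpoint and parameters, for illustration only.
uri = build_uri('https://example.com/search.do', limit: 6, offset: 0)
puts uri # prints https://example.com/search.do?limit=6&offset=0

# body = fetch_body(uri) # uncomment to actually send the request
```

What to do with the body afterwards depends on what the server sends back: HTML, JSON, or something else.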

--
Hassan Schroeder ------------------------ hassan.s...@gmail.com
twitter: @hassan
Consulting Availability : Silicon Valley or remote

fugee ohu

Dec 26, 2018, 1:31:29 AM

to Ruby on Rails: Talk

So I just use browser.execute_script and pass in the full HTTPS URL and query string, just as it appears in the Name column in Chrome DevTools?

fugee ohu

Dec 26, 2018, 1:51:39 AM

to Ruby on Rails: Talk

browser.execute_script('https://gpsfront.sitename.com/getI2iRecommendingResults.do?callback=jQuery18307882644047005491_1545806199753&currentItemList=32819755026&categoryId=200001521&shopId=2339135&companyId=238468932&recommendType=&scenario=pcDetailLeftTopSell&limit=6&offset=0&_=1545806304149')

Selenium::WebDriver::Error::UnknownError: unknown error: Runtime.evaluate threw exception: SyntaxError: Unexpected end of input

(Session info: chrome=71.0.3578.80)

(Driver info: chromedriver=2.42.591071 (0b695ff80972cc1a65a5cd643186d2ae582cd4ac),platform=Linux 4.15.0-43-generic x86_64)

Maybe I need to run it without .do extension and also mayb

Rafael Belo

Dec 26, 2018, 8:19:47 AM

to Ruby on Rails: Talk

You are passing a URL instead of a script.

The `execute_script` function is for executing JavaScript code.

You have to visit the page through the browser instead.

If you're using Capybara, you should use the `visit` method, passing the URL that you want to go to.

If not, check which command your driver provides for visiting URLs.

fugee ohu

Dec 26, 2018, 9:21:32 AM

to Ruby on Rails: Talk

Must I use the URL?

Rafael Belo

Dec 26, 2018, 9:49:28 AM

to rubyonra...@googlegroups.com

Yes. If you are using Capybara, you can use `visit 'http://myurl.com/goes-here'`.

--
You received this message because you are subscribed to a topic in the Google Groups "Ruby on Rails: Talk" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/rubyonrails-talk/CpOPHz-zFsc/unsubscribe.
To unsubscribe from this group and all its topics, send an email to rubyonrails-ta...@googlegroups.com.
To post to this group, send email to rubyonra...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/rubyonrails-talk/acaef991-80d3-4c6c-bccc-fdc017ef4734%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

--

Rafael Belo

Web Developer

Skype: rafaelrpbelo

Twitter: @rafaelrpbelo

Linkedin: rafaelrpbelo

fugee ohu

Dec 26, 2018, 10:08:11 AM

to Ruby on Rails: Talk

This is the first time I'm hearing Capybara recommended for web scraping. Is this the preferred method for what I'm trying to do?

Rafael Belo

Dec 26, 2018, 10:25:33 AM

to rubyonra...@googlegroups.com

Capybara provides a friendly interface over your web drivers; you can integrate it with Selenium, WebKit, Poltergeist, and others.

Try to use it, I think you will like it.

https://github.com/teamcapybara/capybara

fugee ohu

Dec 26, 2018, 11:20:42 AM

to Ruby on Rails: Talk

I need to work in the Rails console, and when I run `visit ...` Rails complains that there is no matching route.

Rafael Belo

Dec 26, 2018, 11:58:35 AM

to rubyonra...@googlegroups.com

You have to include Capybara::DSL.

```

include Capybara::DSL

```

fugee ohu

Dec 26, 2018, 2:11:25 PM

to Ruby on Rails: Talk

require 'capybara/rails'

include Capybara::DSL

No change. I still get the same routing error: ActionController::RoutingError (No route matches [GET] "/getI2iRecommendingResults.do")
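
A likely explanation for the routing error, assuming a default setup: `require 'capybara/rails'` points Capybara at the local Rails app, and the default `rack_test` driver hands every request to that app regardless of the host in the URL, so Rails only sees `GET /getI2iRecommendingResults.do`. One way around it, sketched below, is to use a session backed by a real browser with no local app attached. `remote_html` is a hypothetical helper, and the `:selenium_chrome_headless` driver name assumes a recent Capybara with the selenium-webdriver gem and Chrome installed.

```ruby
# Hypothetical helper: visit a remote URL with the given Capybara session
# and return the HTML the browser ended up with.
def remote_html(session, url)
  session.visit(url)
  session.html
end

# Usage sketch (assumes the capybara and selenium-webdriver gems, plus Chrome):
#
#   require 'capybara'
#   require 'selenium-webdriver'
#
#   # No app is passed, so requests go to the real host, not to Rails routes.
#   session = Capybara::Session.new(:selenium_chrome_headless)
#   puts remote_html(session, 'https://example.com/')
```

Passing the session in keeps the helper independent of the global `Capybara::DSL` methods, which in a Rails console remain bound to the local app.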

Walter Lee Davis

Dec 26, 2018, 2:15:37 PM

to rubyonra...@googlegroups.com

You'll probably get better answers if you show your work. Try writing a single script that demonstrates what you want to do, and post it as a Gist. Link it here, show what the output looks like, and see where that leads you. Oftentimes, working within the constraints of making the example work in a single script forces you to reconsider the problem, or shows you a simple error you made while configuring something more complex.

Walter

Rafael Belo

Dec 26, 2018, 2:15:57 PM

to rubyonra...@googlegroups.com

Which params are you using for `visit`?

fugee ohu

Dec 26, 2018, 2:38:12 PM

to Ruby on Rails: Talk

visit 'https://gpsfront.sitename.com/getI2iRecommendingResults.do?callback=jQuery18307882644047005491_1545806199753&currentItemList=32819755026&categoryId=200001521&shopId=2339135&companyId=238468932&recommendType=&scenario=pcDetailLeftTopSell&limit=6&offset=0&_=1545806304149'

fugee ohu

Dec 26, 2018, 5:32:07 PM

to Ruby on Rails: Talk

I'd like to write a script after I get my commands down; for now I'm working in the Rails console. It may not be the intended use of Capybara to visit URLs not defined in routes.rb. Should I be using something else to make the request?

Walter Lee Davis

Dec 26, 2018, 6:14:19 PM

to rubyonra...@googlegroups.com

Try using an API tool, like Faraday.

gem 'faraday'
require 'faraday'
response = Faraday.get('https://entire.url.of/your/api/data.json')

whatever_parsing_tool_you_want.parse(response.body)

Walter

fugee ohu

Dec 27, 2018, 9:25:58 AM

to Ruby on Rails: Talk

Thanks, that works. The returned data is delimited as \"name\":value,...

#(Text "/**/jQuery18307882644047005491_1545806199753({\"success\":true,\"code\":0,\"results\":[{\"productId\":32617749905,\"sellerId\":228628782,\"oriMinPrice\":\"US $363.00\",\"oriMaxPrice\"...

Am I gonna have to regex my way through it?

fugee ohu

Dec 27, 2018, 9:29:40 AM

to Ruby on Rails: Talk

I used Nokogiri::HTML.parse(response.body). It isn't converting the JavaScript response into something more friendly.
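
The body shown above looks like JSONP, that is, JSON wrapped in a JavaScript callback call, rather than HTML, so an HTML parser such as Nokogiri has little to work with. One possible approach, sketched here, is to strip the `callback(...)` wrapper and hand the rest to Ruby's `json` library. `parse_jsonp` is a hypothetical helper, and the sample payload is a shortened, made-up stand-in for the real response.

```ruby
require 'json'

# Strip a JSONP wrapper such as `/**/callbackName({...});` and parse the
# JSON between the first "(" and the last ")".
def parse_jsonp(body)
  first = body.index('(')
  last  = body.rindex(')')
  if first.nil? || last.nil? || last < first
    raise ArgumentError, 'no JSONP wrapper found'
  end
  JSON.parse(body[(first + 1)...last])
end

# Made-up payload in the same general shape as the response in this thread.
sample = '/**/jQuery123_456({"success":true,"code":0,' \
         '"results":[{"productId":1,"oriMinPrice":"US $363.00"}]})'

data = parse_jsonp(sample)
data['results'].each do |item|
  puts "#{item['productId']}: #{item['oriMinPrice']}" # prints 1: US $363.00
end
```

With the Faraday call from earlier in the thread, this would be used as `parse_jsonp(response.body)`, which gives an ordinary Ruby Hash and needs no regex over the individual fields.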

Rafael Belo

Dec 27, 2018, 9:38:18 AM

to rubyonra...@googlegroups.com

You won't get a friendly, parsed JavaScript response. The JavaScript is not the information itself; it is a set of commands that manipulate the browser's DOM. That's why we use a driver to get this information.

If you send a GET request and parse the result, you get the raw HTML with the JavaScript code in it. If you use a driver, it fetches the response and executes the loaded JavaScript. This is the key.

fugee ohu

Dec 27, 2018, 10:57:57 AM

to Ruby on Rails: Talk

require 'selenium-webdriver'

driver = Selenium::WebDriver.for :chrome

driver.get("https://gpsfront.website.com/getI2iRecommendingResults.do?callback=jQuery18307882644045605491_1545806199753&currentItemList=32819755026&categoryId=200121521&shopId=2339135&companyId=238423932&recommendType=&scenario=pcDetailLeftTopSell&limit=6&offset=0&_=1545800704149")

The result:

=> nil

The path I posted here is fictitious, but the real path also returned => nil
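
On the `=> nil`: in selenium-webdriver, `driver.get` navigates the browser, and its return value is not the page content, so seeing `nil` in the console does not by itself mean the request failed. A minimal sketch, assuming the goal is to read what the browser loaded: navigate first, then ask the driver for `page_source`. `rendered_source` is a hypothetical helper name.

```ruby
# Hypothetical helper: navigate, then return what the browser currently holds.
def rendered_source(driver, url)
  driver.get(url)    # navigation only; the return value is not the page
  driver.page_source # the loaded document as a String
end

# Usage sketch (assumes the selenium-webdriver gem and a matching chromedriver):
#
#   require 'selenium-webdriver'
#
#   driver = Selenium::WebDriver.for :chrome
#   begin
#     puts rendered_source(driver, 'https://example.com/')
#   ensure
#     driver.quit # always close the browser, even after an error
#   end
```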

fugee ohu

Dec 27, 2018, 11:36:43 AM

to Ruby on Rails: Talk

Can you please read my continuing discussion with Rafael?
