ERROR: Mandatory elem description missing, while scraping from detail_page


Maciek Grodzki

Jul 7, 2016, 4:28:09 PM
to django-dynamic-scraper
Hello,
I'm sorry for spamming, but I can't solve this problem: I can't scrape anything from the detail page. I need to get the whole description of each event from this site: http://b1.pl/imprezy. In the admin panel I have set everything as in the picture below. I checked the XPath in an online tester and it works, and I don't get any message saying that the XPath is wrong. I also have a second, I think easier, question: can someone tell me how to configure thumbnail saving so that thumbnails are stored in a model ImageField rather than a CharField, as in the Advanced Topics section of the documentation? I would be grateful for any help.

Holger Drewes

Jul 11, 2016, 10:36:57 AM
to django-dyna...@googlegroups.com
Hi Maciek,
I don't know what the exact problem is, but judging from your screenshot: you have to enter one request page type (RPT) PER PAGE, not PER ITEM SCRAPED FROM A DETAIL PAGE. So if you are scraping from just one detail page, provide only one RPT. That is quite likely what is causing your problem; I have never checked what happens if the library is used with more than one RPT per detail page URL. Maybe I should add a check.
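
A check of this kind could look roughly like the sketch below. It is not part of DDS: the `RequestPageType` dataclass, its `page_type` field, the "MP"/"DP1" labels, and the `find_overconfigured_pages` helper are all assumptions made up for illustration, standing in for the RPT rows entered in the admin.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestPageType:
    # Hypothetical stand-in for one RPT row in the admin, not the DDS model.
    page_type: str   # assumed labels: "MP" = main page, "DP1" = first detail page
    attr_name: str   # name of the attribute the row was attached to


def find_overconfigured_pages(rpts):
    """Return the page types that have more than one RPT row."""
    counts = Counter(rpt.page_type for rpt in rpts)
    return sorted(page for page, n in counts.items() if n > 1)


# One RPT per scraped item instead of one per page: "DP1" appears twice.
rpts = [
    RequestPageType("MP", ""),
    RequestPageType("DP1", "description"),
    RequestPageType("DP1", "thumbnail"),
]
overconfigured = find_overconfigured_pages(rpts)
print(overconfigured)
```

With exactly one row per page the helper returns an empty list, so it could be run as a pre-flight step before starting a spider.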

Since DDS makes assumptions about subfolders, you shouldn't use an ImageField but a CharField. An ImageField probably won't work.

Greetings
Holger

--
You are receiving this message because you are subscribed to the Google Groups "django-dynamic-scraper" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-dynamic-sc...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Siggy

Oct 30, 2016, 7:29:56 PM
to django-dynamic-scraper
Dear Holger,

Regarding the RPT: I have configured DDS as shown in the screenshot (admin). If I disable the detail page, everything is fine. If I create a new scraper configured with the detail page URL, everything is also fine. However, when I scrape the href from the site and look for the elem on the detail page, I receive an error (shown in the screenshot)...

Do you have any idea what I am doing wrong, or where to look?

Kind regards,
Siggy

admin.png
RequestPageType.png

Holger Drewes

Oct 31, 2016, 3:14:35 AM
to django-dyna...@googlegroups.com
Hi,
Detail pages have to be connected to a URL attribute type. Your field isin is probably not a URL, or is it?
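
The rule can be pictured with a small sketch. Nothing here is DDS code: the `ObjAttr` dataclass, the two type labels, and `can_link_detail_page` are hypothetical names chosen only to show the idea that a detail page must hang off a URL-type attribute.

```python
from dataclasses import dataclass

# Hypothetical attribute type labels, used for this sketch only.
STANDARD = "STANDARD"
DETAIL_PAGE_URL = "DETAIL_PAGE_URL"


@dataclass(frozen=True)
class ObjAttr:
    # Hypothetical stand-in for one scraped object attribute.
    name: str
    attr_type: str


def can_link_detail_page(attr):
    """A detail page can only be reached through a URL-type attribute."""
    return attr.attr_type == DETAIL_PAGE_URL


isin = ObjAttr("isin", STANDARD)        # plain value, holds no link to follow
url = ObjAttr("url", DETAIL_PAGE_URL)   # holds the link to the detail page

print(can_link_detail_page(isin), can_link_detail_page(url))
```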

Greetings
Holger 

Siggy

Oct 31, 2016, 5:21:38 AM
to django-dynamic-scraper
Hi Holger,

Thanks, it is working! Below are my ATTR_TYPE and a minor adjustment to SCRAPERS.

General workflow: I scrape my partial @href from the main page via the URL elem, then create a full URL with a built-in processor (pre_url). In this case my URL elem is configured as a DETAILED_PAGE_URL and is used for the detail pages. The ISIN elem should then be extracted from the detail page.
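
The URL-building step in this workflow can be illustrated with plain Python. This is a sketch of the idea only, not the DDS implementation of pre_url; the `prepend_base_url` function, the example.com base URL, and the sample path are assumptions for illustration.

```python
def prepend_base_url(path, base_url):
    """Turn a partial href into a full URL by prepending a base URL."""
    if path.startswith(("http://", "https://")):
        return path  # already a full URL, leave it untouched
    # Join with exactly one slash between base URL and path.
    return base_url.rstrip("/") + "/" + path.lstrip("/")


# Hypothetical partial href as it might be scraped from a main page.
full_url = prepend_base_url("/funds/detail?id=1", "http://www.example.com/")
print(full_url)
```

The resulting full URL is what the detail page request would then be made against.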

 
So to achieve this, go to the REQUEST PAGE TYPE for the detail page and select the url attribute as SCRAPER OBJ ATTR instead of ISIN.

Kind regards,

Siggy

 