How to scrape next-page items

rana fge

Mar 11, 2017, 2:24:15 AM
to scrapy-users
Hello, I am new to programming and Scrapy. While learning Scrapy I tried to scrape some items, but I am unable to scrape the items on the next page. Please help: how do I parse the next-page link URL in this situation?
Here is my code:
import scrapy

from scrapy.linkextractors import LinkExtractor


class BdJobs(scrapy.Spider):
    name = 'bdjobs'
    # Lowercased so it matches request hostnames, which are lowercase.
    allowed_domains = ['bdjobs.com']
    start_urls = [
    ]
    #rules=( Rule(LinkExtractor(allow()), callback='parse', follow=True))

    def parse(self, response):
        # Yield one item per job-title link on the current page.
        for title in response.xpath('//div[@class="job-title-text"]/a'):
            yield {
                'titles': title.xpath('./text()').extract()[0].strip()
            }
   
Here is the output:
[
{"titles": "Senior Software Engineer (.Net)"},
{"titles": "Java programmer"},
{"titles": "VLSI Design Engineer (Japan)"},
{"titles": "Assistant Executive (Computer Lab-Evening programs)"},
{"titles": "IT Officer, Business System Management"},
{"titles": "Executive, IT"},
{"titles": "Officer, IT"},
{"titles": "Laravel PHP Developer"},
{"titles": "Executive - IT (EDISON Footwear)"},
{"titles": "Software Engineer (PHP/ MySQL)"},
{"titles": "Software Engineer [Back End]"},
{"titles": "Full Stack Developer"},
{"titles": "Mobile Application Developer (iOS/ Android)"},
{"titles": "Head of IT Security Operations"},
{"titles": "Database Administrator, Senior Analyst"},
{"titles": "Infrastructure Delivery Senior Analyst, Network Security"},
{"titles": "Head of IT Support Operations"},
{"titles": "Hardware Engineer"},
{"titles": "JavaScript/ Coffee Script Programmer"},
{"titles": "Trainer - Auto CAD"},
{"titles": "ASSISTENT PRODUCTION OFFICER"},
{"titles": "Customer Relationship Executive"},
{"titles": "Head of Sales"},
{"titles": "Sample Master"},
{"titles": "Manager/ AGM (Finance & Accounts)"},
{"titles": "Night Aiditor"},
{"titles": "Officer- Poultry"},
{"titles": "Business Analyst"},
{"titles": "Sr. Executive - Sales & Marketing (Sewing Thread)"},
{"titles": "Civil Engineer"},
{"titles": "Executive Director-HR"},
{"titles": "Sr. Executive (MIS & Internal Audit)"},
{"titles": "Manager, Health & Safety"},
{"titles": "Computer Engineer (Diploma)"},
{"titles": "Sr. Manager/ Manager, Procurement"},
{"titles": "Specialist, Content"},
{"titles": "Manager, Warranty and Maintenance"},
{"titles": "Asst. Manager - Compliance"},
{"titles": "Officer/Sr. Officer/Asst. Manager (Store)"},
{"titles": "Manager, Maintenance (Sewing)"}
]

Note: I have attached an image file for convenience.

bdjobs.png

Jim Power

May 2, 2017, 7:55:09 AM
to scrapy-users
Hello,

Have you tried using CrawlSpider instead of the base Spider class? https://doc.scrapy.org/en/latest/topics/spiders.html#crawlspider

Jim 