Missing scheme in request url: %s' % self.

NiveRam

Feb 13, 2017, 9:31:02 AM

to scrapy-users

Hi. I'm a beginner in Scrapy and Python. I tried to crawl a set of URLs and came across this error on Scrapinghub: 'Missing scheme in request url: %s' % self._url). Below is the code.

import scrapy
from bs4 import BeautifulSoup, SoupStrainer
import urllib2
from scrapy.selector import Selector
from scrapy.http import HtmlResponse
import re
import pkgutil
from pkg_resources import resource_string

data = pkgutil.get_data("friday2", "resources/urllist.txt")

class FridaySpider(scrapy.Spider):
    name = 'fridayspider'
    start_urls = [url.strip() for url in data]

    def parse(self, response):
        soup = BeautifulSoup(response.text, 'lxml')
        url = response.url
        yield {
            "title": soup.title.string,
            "url": response.url,
        }

Thanks in advance :) !!

Sayth Renshaw

Feb 14, 2017, 4:28:16 AM

to scrapy-users

Thought this from the docs may help.

class MySpider(scrapy.Spider):
    name = 'myspider'

    def start_requests(self):
        return [scrapy.FormRequest("http://www.example.com/login",

https://doc.scrapy.org/en/latest/topics/spiders.html

Sayth

Ian He

Feb 16, 2017, 8:35:06 AM

to scrapy-users

I think the request URL needs to be a complete URL, which means your URL should include 'http://' or 'https://'.
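
One likely cause in the posted spider: pkgutil.get_data returns the whole file as a single byte string, so "for url in data" walks over it one character at a time, and none of those single characters has a scheme. Below is a minimal sketch of one way around that, assuming urllist.txt holds one URL per line. The helper name load_start_urls and the default "http" scheme are made up for illustration and are not part of Scrapy.

```python
def load_start_urls(raw, default_scheme="http"):
    """Turn the raw contents of a URL list file into complete URLs.

    Hypothetical helper: assumes one URL per line in the file.
    """
    # pkgutil.get_data returns bytes, so decode before splitting.
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")

    urls = []
    # Split on line breaks so each item is a whole URL, not one character.
    for line in raw.splitlines():
        url = line.strip()
        if not url:
            continue  # skip blank lines
        # Add a scheme when the line does not already contain one.
        if "://" not in url:
            url = "%s://%s" % (default_scheme, url)
        urls.append(url)
    return urls
```

In the spider this could be used as start_urls = load_start_urls(data), with data loaded through pkgutil.get_data as in the original post. Whether "http" is the right fallback depends on the sites in the list.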
