Missing scheme in request url: %s' % self._url)


NiveRam

Feb 13, 2017, 9:31:02 AM
to scrapy-users
Hi. I'm a beginner in Scrapy and Python. I tried to crawl a set of URLs and came across this error on Scrapinghub: 'Missing scheme in request url: %s' % self._url). Below is the code.

import scrapy
from bs4 import BeautifulSoup,SoupStrainer
import urllib2
from scrapy.selector import Selector
from scrapy.http import HtmlResponse
import re
import pkgutil
from pkg_resources import resource_string

data = pkgutil.get_data("friday2","resources/urllist.txt")

class FridaySpider(scrapy.Spider):
    name = 'fridayspider'
    start_urls = [url.strip() for url in data]

    def parse(self, response):
        soup = BeautifulSoup(response.text, 'lxml')
        url = response.url
        yield {
            "title": soup.title.string,
            "url": response.url,
        }
Thanks in advance :) !! 

Sayth Renshaw

Feb 14, 2017, 4:28:16 AM
to scrapy-users
I thought this example from the docs might help.

class MySpider(scrapy.Spider):
    name = 'myspider'

    def start_requests(self):
        return [scrapy.FormRequest("http://www.example.com/login",

https://doc.scrapy.org/en/latest/topics/spiders.html

Sayth

Ian He

Feb 16, 2017, 8:35:06 AM
to scrapy-users
I think the request URL needs to be a complete URL, which means each of your URLs should include 'http://' or 'https://'.
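
One likely cause, going only by the code posted above: pkgutil.get_data returns the whole file as a single string, so looping over it directly yields one character at a time rather than one URL per line, and a lone character such as "h" has no scheme. Below is a minimal sketch of one way to build the list, assuming urllist.txt holds one URL per line. The helper name load_start_urls and the default_scheme parameter are made up for illustration and are not part of Scrapy.

```python
def load_start_urls(raw, default_scheme="http"):
    """Turn the raw contents of a URL list file into a list of absolute URLs.

    `raw` is the file content as returned by pkgutil.get_data (bytes on
    Python 3, str on Python 2). Assumes one URL per line.
    """
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")

    urls = []
    # splitlines() iterates over lines; iterating over the string itself
    # would iterate over single characters instead.
    for line in raw.splitlines():
        url = line.strip()
        if not url:
            continue  # skip blank lines
        if "://" not in url:
            # No scheme present, so prepend one (an assumption; the right
            # scheme depends on the site).
            url = default_scheme + "://" + url
        urls.append(url)
    return urls


# Iterating over a string directly gives characters, not lines:
chars = [c.strip() for c in "http://example.com\n"]
print(chars[:4])

sample = b"http://example.com/a\n\nexample.org/b  \nhttps://example.net\n"
print(load_start_urls(sample))
```

In the spider, this would replace the list comprehension, for example start_urls = load_start_urls(data).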