How to insert extracted data into a MySQL database


masroor javed

Apr 16, 2014, 6:35:34 AM
to scrapy...@googlegroups.com
Hi all, I tried to insert data into a MySQL database, but the data is not inserted and there is no error while running the crawl.
My pipeline code is below; please suggest how to insert the data.
Is there a scrapy crawl command for inserting into MySQL, the way "scrapy crawl -o datafile.csv -t csv" exports to CSV, or do I simply run the spider with "scrapy crawl spidername"?
Please help me, guys, I am new to Scrapy.

import sys
import MySQLdb
import hashlib
from scrapy.exceptions import DropItem
from scrapy.http import Request

class PagitestPipeline(object):
    def __init__(self):
        self.conn = MySQLdb.connect("localhost", "root", " ", "test",
                                    charset="utf8", use_unicode=True)
        self.cursor = self.conn.cursor()

    def process_item(self, item, spider):
        try:
            # Placeholders must not be quoted: the driver escapes and
            # quotes each value itself, so writing '%s' wraps the
            # already-quoted value in a second pair of quotes.
            self.cursor.execute(
                "INSERT INTO infosec (titlename, standname) VALUES (%s, %s)",
                (item['titlename'].encode('utf-8'),
                 item['standname'].encode('utf-8')))
            self.conn.commit()
        except MySQLdb.Error, e:
            print "Error %d: %s" % (e.args[0], e.args[1])
        return item
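A runnable way to check the placeholder behaviour in isolation, using the stdlib sqlite3 module as a stand-in for MySQLdb (sqlite3 uses ? placeholders where MySQLdb uses %s; the table and item values below are invented for the demo):

```python
import sqlite3

# Stand-in for MySQLdb so the demo runs anywhere: same cursor/commit
# pattern, but sqlite3's paramstyle is ? rather than %s.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE infosec (titlename TEXT, standname TEXT)")

item = {"titlename": "ISO 27001", "standname": "Infosec standard"}

# Correct form: bare placeholders, values passed as a tuple.
cur.execute("INSERT INTO infosec (titlename, standname) VALUES (?, ?)",
            (item["titlename"], item["standname"]))
conn.commit()

cur.execute("SELECT titlename, standname FROM infosec")
print(cur.fetchall())  # [('ISO 27001', 'Infosec standard')]
```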

Bill Ebeling

Apr 16, 2014, 10:02:26 AM
to scrapy...@googlegroups.com
I would build the insert like so:

db = MySQLdb.connect("localhost","root"," ","test",charset="utf8", use_unicode=True )
cursor = db.cursor()
sql = "INSERT INTO infosec (titlename, standname) VALUES (%s, %s)"
args = (item['titlename'].encode('utf-8'), item['standname'].encode('utf-8'))
cursor.execute(sql, args)
db.commit()


But I would also make it a function and call the function from the pipe.

As for not getting errors, I would put some log entries all over the pipe and print out the status of the item and make sure that the items are even getting to the DB call.
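That logging idea can be sketched stand-alone (the logger name and the item dict here are invented for illustration; in a real pipeline the logging would sit inside the pipeline class's process_item):

```python
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("pagitest.pipeline")  # arbitrary example name

def process_item(item):
    # Log before the DB call: if this line never shows up in the crawl
    # output, the items are not reaching the pipeline at all.
    logger.debug("pipeline received item: %r", item)
    # ... the INSERT and commit would go here ...
    logger.debug("item written to DB")
    return item

process_item({"titlename": "example title", "standname": "example standard"})
```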

Hope that helps a little,

Bill

Svyatoslav Sydorenko

Apr 16, 2014, 5:46:49 PM
to scrapy...@googlegroups.com
Try using this pipeline:
http://snipplr.com/view/66986/mysql-pipeline/

On Wednesday, April 16, 2014 at 13:35:34 UTC+3, masroor javed wrote:

masroor javed

Apr 17, 2014, 12:35:44 AM
to scrapy...@googlegroups.com
Hi Svyatoslav, I tried the code from that pipeline, but the data still could not be inserted.
I am also confused about whether there is some other command needed to insert the data.
At first I used this command to run the spider: "spider crawl spidername".
May I know whether that command also runs the pipeline code? I used it, but the data was not inserted and there was no error.
So please let me know what I should do.


--
You received this message because you are subscribed to the Google Groups "scrapy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scrapy-users...@googlegroups.com.
To post to this group, send email to scrapy...@googlegroups.com.
Visit this group at http://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/d/optout.

masroor javed

Apr 17, 2014, 12:52:29 AM
to scrapy...@googlegroups.com
Sorry, I meant "scrapy crawl spidername" instead of "spider crawl spidername".

Svyatoslav Sydorenko

Apr 17, 2014, 4:07:41 AM
to scrapy...@googlegroups.com
Did you add it to ITEM_PIPELINES dict first?
http://doc.scrapy.org/en/latest/topics/item-pipeline.html#activating-an-item-pipeline-component
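For reference, activation is a single setting in the project's settings.py. A minimal sketch; the dotted path 'pagitest.pipelines.PagitestPipeline' is assumed from the class name earlier in this thread and must match your own project layout:

```python
# settings.py (sketch): the key is the dotted path to the pipeline class
# ('pagitest.pipelines.PagitestPipeline' is an assumed path), the value is
# an order number from 0 to 1000 -- lower numbers run first.
ITEM_PIPELINES = {
    'pagitest.pipelines.PagitestPipeline': 300,
}
```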

On Thursday, April 17, 2014 at 07:35:44 UTC+3, masroor javed wrote:

masroor javed

Apr 17, 2014, 5:09:19 AM
to scrapy...@googlegroups.com
No, I didn't add it.
May I know where to add this?
I mean, in the pipeline code or in settings?

masroor javed

Apr 17, 2014, 6:25:13 AM
to scrapy...@googlegroups.com
Hi Svyatoslav, I have inserted the data into the database.

Thank you so much for helping me.
I hope you can help me again in the future.

Thank you.


On Thu, Apr 17, 2014 at 1:37 PM, Svyatoslav Sydorenko <svyat...@sydorenko.org.ua> wrote:

Mukesh Salaria

May 14, 2014, 2:33:29 PM
to scrapy...@googlegroups.com
Hey Masroor,

I am also a newbie in Scrapy, like you. Since you managed to insert data into the database, could you please let me know which steps I have to follow to do the same?

Regards,
Mukesh

masroor javed

May 15, 2014, 12:16:25 AM
to scrapy...@googlegroups.com
Yes, sure. Could you share your code here?

Anurag Sharma

Dec 5, 2014, 8:40:31 AM
to scrapy...@googlegroups.com
Hi Masroor,

I want to insert my data into MySQL.
Please find my code below.

MY SPIDER

import scrapy

from craigslist_sample.items import AmazonDepartmentItem
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors import LinkExtractor

class AmazonAllDepartmentSpider(scrapy.Spider):

    name = "amazon"
    allowed_domains = ["amazon.com"]
    start_urls = [
        "http://www.amazon.com/gp/site-directory/ref=nav_sad/187-3757581-3331414"
    ]
    def parse(self, response):
        for sel in response.xpath('//ul/li'):
            item = AmazonDepartmentItem()
            item['title'] = sel.xpath('a/text()').extract()
            item['link'] = sel.xpath('a/@href').extract()
            item['desc'] = sel.xpath('text()').extract()
            # yield inside the loop: a single return after the loop
            # would emit only the last item scraped
            yield item


MY PIPELINE

import sys
import MySQLdb
import hashlib
from scrapy.exceptions import DropItem
from scrapy.http import Request

class MySQLStorePipeline(object):


    host = 'derr.com'
    user = 'amazon'
    password = 'mertl123'
    db = 'amazon_project'

    def __init__(self):
        self.connection = MySQLdb.connect(self.host, self.user, self.password, self.db)
        self.cursor = self.connection.cursor()


    def process_item(self, item, spider):
        try:
            # extract() returns a list of strings, so join it into one
            # string before encoding; .encode() on a list would crash.
            link = ''.join(item['link']).encode('utf-8')
            self.cursor.execute(
                """INSERT INTO amazon_project.ProductDepartment
                   (ProductDepartmentLilnk) VALUES (%s)""",
                (link,))  # trailing comma: the arguments must be a tuple
            self.connection.commit()  # was self.conn, which is never defined
        except MySQLdb.Error, e:
            print "Error %d: %s" % (e.args[0], e.args[1])
        return item
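For reference, the same pipeline shape can use Scrapy's open_spider/close_spider hooks so the connection lives exactly as long as the crawl. A minimal sketch with the stdlib sqlite3 module standing in for MySQLdb so it runs anywhere; the table and column names are taken from the pipeline above, the item is invented:

```python
import sqlite3

class SQLiteStorePipeline(object):
    # Scrapy calls open_spider/close_spider at crawl start/end,
    # so the connection is opened and closed exactly once.
    def open_spider(self, spider):
        self.connection = sqlite3.connect(":memory:")
        self.cursor = self.connection.cursor()
        self.cursor.execute(
            "CREATE TABLE IF NOT EXISTS ProductDepartment "
            "(ProductDepartmentLilnk TEXT)")

    def process_item(self, item, spider):
        # extract() returns a list, so join it into a single string.
        link = "".join(item["link"])
        self.cursor.execute(
            "INSERT INTO ProductDepartment (ProductDepartmentLilnk) VALUES (?)",
            (link,))
        self.connection.commit()
        return item

    def close_spider(self, spider):
        self.connection.close()

# Exercising the hooks by hand, outside of Scrapy:
pipeline = SQLiteStorePipeline()
pipeline.open_spider(None)
pipeline.process_item({"link": ["http://www.amazon.com/gp/site-directory"]}, None)
```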

and I am running this command:
scrapy crawl amazon


Thanks.