
A Question on URLLIB


joy99

Mar 15, 2011, 2:10:18 AM
Dear Group,
I am trying to build a web crawler in Python, and for that I am using
the urllib module: I do import urllib and then call
urllib.urlopen("url").
Am I on the right track?
I would be grateful if someone could point out any mistakes I am making.
Best Regards,
Subhabrata Banerjee.

David Marek

Mar 15, 2011, 4:49:31 AM
to joy99, pytho...@python.org
You are doing fine so far :-)

However, urllib is quite a low-level module. Do you know mechanize
(http://wwwsearch.sourceforge.net/mechanize/)? I have used it in my
crawler. If you want to crawl a specific site for known data, then
Scrapy (http://scrapy.org/) could be useful too.
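
For reference, here is a minimal sketch of the plain-urllib approach asked about above: fetch one page and collect the links on it, which is the basic step a crawler repeats. It is written for Python 3, where the urllib.urlopen call from the question lives at urllib.request.urlopen. The names LinkParser, extract_links and fetch are made up for this illustration, and the UTF-8 fallback and 10-second timeout are assumptions rather than anything required by the library.

```python
# Python 3 sketch. In Python 2 the equivalent call was urllib.urlopen(url).
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL.

    Hypothetical helper class for this example.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Skip anchors without an href (or with an empty one).
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return absolute URLs for all links found in an HTML string."""
    parser = LinkParser(base_url)
    parser.feed(html)
    return parser.links


def fetch(url, timeout=10):
    """Download a page and return its text.

    Assumption: fall back to UTF-8 when the server declares no charset.
    """
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

With a real start URL, something like extract_links(fetch(start_url), start_url) would give the next pages to visit. A real crawler would also need a visited set, error handling, politeness delays and robots.txt checks, which is part of what tools such as mechanize and Scrapy help with.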

David Marek
dav...@atrey.karlin.mff.cuni.cz
http://davidmarek.cz

> --
> http://mail.python.org/mailman/listinfo/python-list
>

joy99

Mar 15, 2011, 5:31:04 AM
On Mar 15, 1:49 pm, David Marek <dav...@atrey.karlin.mff.cuni.cz>
wrote:

> You are doing fine so far :-)
>
> However, urllib is quite low level module. Do you know mechanize
> http://wwwsearch.sourceforge.net/mechanize/ ? I have used it in my
> crawler. If you want to crawl a specific site for known data then
> Scrapy http://scrapy.org/ could be useful too.
>
> David Marek
> dav...@atrey.karlin.mff.cuni.cz
> http://davidmarek.cz
>
> On Tue, Mar 15, 2011 at 7:10 AM, joy99 <subhakolkata1...@gmail.com> wrote:
> > Dear Group,
> > I am trying to construct a web based crawler with Python and for that
> > I am using the URLLIB module, and by doing
> > import urllib and then trying with urllib.urlopen("url).
> > Am I going fine?
> > If some one can kindly highlight if I am doing any mistake.
> > Best Regards,
> > Subhabrata Banerjee.
> > --
> > http://mail.python.org/mailman/listinfo/python-list

Thanks David, I'll surely check them out.
