Final Project Description


Wesley May

Nov 16, 2012, 10:54:46 AM
to csc-32...@googlegroups.com
Hi all,

Here's the final project description. I'll be posting crawler.py a little bit later. The tutorial on Monday will cover the project, so read through it and bring any questions you have :)
csc326proj.pdf

Wesley May

Nov 16, 2012, 2:55:51 PM
to csc-32...@googlegroups.com
Here's the crawler.py file (never mind the .pys extension... you can change it to .py if you want)

http://www.petergoodman.me/courses/2011/csc326/project/crawler.pys

You will require the BeautifulSoup package to run the crawler. You'll notice that there are a bunch of TODOs in the crawler also. These are for you to fill in :D
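One of the pieces a crawler has to get right is pulling the outgoing links out of a fetched page. Here is a minimal sketch of that step, not the contents of crawler.py: it swaps in the standard library's `html.parser` for BeautifulSoup so that it runs without any extra packages, it is written for Python 3, and the names `LinkCollector` and `extract_links` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the <a href="..."> tags of one page.

    Hypothetical helper for illustration; not part of crawler.py.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL, then drop
                # the #fragment so the same page is not queued twice.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.append(absolute)


def extract_links(base_url, html):
    """Return the outgoing links found in html, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

A crawler would call something like `extract_links(url, page_source)` on each fetched page and add any URLs it has not seen yet to its queue.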

There's a bit of info about the crawler from last year's TA here: https://groups.google.com/forum/?fromgroups=#!topic/csc-326-2011/dLMOmUrEedE

There's also a simple PageRank implementation you can use / build off of here: http://www.petergoodman.me/courses/2011/csc326/project/pagerank.pys
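For anyone who wants to see the idea before opening that file, here is a rough sketch of iterative PageRank over a list of (from, to) link pairs. It is not the linked pagerank.pys: the function name, the parameter names, the damping factor of 0.85, and the handling of pages with no outgoing links are all assumptions chosen for illustration, and it is written for Python 3.

```python
def page_rank(links, num_iterations=20, damping=0.85):
    """Return {page_id: rank} for an iterable of (from_id, to_id) pairs.

    Illustrative sketch only; names and defaults are assumptions.
    """
    out_links = {}
    nodes = set()
    for src, dst in links:
        out_links.setdefault(src, set()).add(dst)
        nodes.update((src, dst))

    n = len(nodes)
    if n == 0:
        return {}

    # Start with the rank spread evenly over all pages.
    ranks = {node: 1.0 / n for node in nodes}

    for _ in range(num_iterations):
        # Pages with no outgoing links share their rank with everyone,
        # so the total rank stays at 1.
        dangling = sum(ranks[node] for node in nodes if node not in out_links)
        base = (1.0 - damping) / n + damping * dangling / n
        new_ranks = {node: base for node in nodes}

        # Every other page splits its rank evenly among the pages it links to.
        for src, targets in out_links.items():
            share = damping * ranks[src] / len(targets)
            for dst in targets:
                new_ranks[dst] += share

        ranks = new_ranks

    return ranks
```

The link pairs would come from whatever the crawler records, for example `page_rank([(1, 2), (2, 3), (3, 1)])` for three pages that link in a cycle.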


Camilla Tabis

Nov 16, 2012, 3:17:30 PM
to csc-32...@googlegroups.com
Hi,

The project description on the course website links to this crawler:
http://www.eecg.toronto.edu/~jzhu/csc326/readings/crawler.py

I'm confused about which one we are supposed to be using.

Thanks!

Wesley May

Nov 16, 2012, 4:08:23 PM
to csc-32...@googlegroups.com
As far as I know, you can use *any* method you want to crawl webpages. That means you can use this crawler or the other one I posted. There do seem to be a few different versions floating around. If you want to write your own from scratch, you can do that too :D

Wesley May

Nov 16, 2012, 4:09:22 PM
to csc-32...@googlegroups.com
In either case, you'll have to look through the crawler and add some pieces. The one you linked to looks like it might be a little bit simpler to understand and use.

Jianwen Zhu

Nov 16, 2012, 6:17:38 PM
to csc-32...@googlegroups.com
Hi all,

The course website has been updated with the project description Wesley posted on
the forum. For the reference implementations of the crawler and the PageRank algorithm, let's use the
one Wesley posted (the link on the course website now points to it as well).
In any case, they are provided as a reference, and you are welcome to improve on them.

Regards,

JZ

Lucas Costa Oliveira

Nov 27, 2012, 4:47:14 PM
to csc-32...@googlegroups.com
Good afternoon,

Is there any way I can use another framework to develop the project?

Thanks,

Lucas

Wesley May

Nov 27, 2012, 4:50:58 PM
to
Yes, if there's something else you're familiar with (like Tornado), I have no problem with that. HOWEVER, if you decide to use something besides Bottle, then your code needs to be *very* well commented so that I can follow it. As usual, your instructions for how to run your project must be very clear, ESPECIALLY if I need to install anything.

Lucas Costa Oliveira

Nov 27, 2012, 7:08:22 PM
to csc-32...@googlegroups.com
Thanks,

I will be using GAE (Google App Engine).

The code will be very well commented!

Thanks for the answer =D