Analyzing keywords from web page text [Urgent]


Ramkrishan Bhatt

Dec 6, 2013, 6:37:37 AM
to web...@googlegroups.com
Hello,
   I need to build a keyword generator that works on any search output. For example, if I search for "car" in a search engine, I get a page with lots of results, and from that page I need to find whatever keywords can be made from its words.

I need to return the most likely words as keywords after analyzing an HTML page. I am currently using BeautifulSoup. Please also tell me whether Scrapy (http://scrapy.org/) could be used for this task.

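import urllib2
from collections import Counter

# Assumption: BeautifulSoup 4 (the bs4 package), since get_text() is used below.
from bs4 import BeautifulSoup


def myfunc():
    # web2py controller action: FORM, INPUT, IS_NOT_EMPTY, request, session
    # and response are provided by the framework.
    form = FORM('Key Word Finder',
                INPUT(_name='URL', requires=IS_NOT_EMPTY()),
                INPUT(_type='submit'))
    if form.accepts(request, session):
        response.flash = 'form accepted'
        new_url = request.vars.URL
        # Download the page; close the socket even if the read fails.
        usock = urllib2.urlopen(new_url)
        try:
            data = usock.read()
        finally:
            usock.close()
        # Name the parser explicitly so the result does not depend on
        # which parsers happen to be installed.
        soup = BeautifulSoup(data, 'html.parser')
        page_text = soup.get_text()
        # Count how often each whitespace-separated word occurs.
        key_words = Counter(page_text.split())
        # most_common() returns (word, count) pairs, most frequent first.
        return dict(form=form, grid=key_words.most_common())
    elif form.errors:
        response.flash = 'form has errors'
    else:
        response.flash = 'please fill the form'
    return dict(form=form)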

Leonel Câmara

Dec 6, 2013, 4:37:21 PM
to web...@googlegroups.com
That's not how you find keywords. Raw word counts mostly surface common words such as "the" and "and", so you need to use something like tf-idf. Removing stop words would also be useful.
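
To make that concrete, here is a minimal sketch of tf-idf with stop-word removal, using only the Python standard library. It is not a definitive implementation: the STOP_WORDS list, the tokenize and tfidf_keywords helpers, and the three sample pages are all made up for illustration, and the idf used is the plain log(N / df) variant with no smoothing. Real use would need a fuller stop-word list and a larger collection of pages to compare against.

```python
import math
import re
from collections import Counter

# A tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "are", "as", "at", "for", "in", "is", "it",
              "of", "on", "or", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def tfidf_keywords(documents, index, top_n=5):
    """Rank the words of documents[index] by tf-idf against the whole list."""
    tokenized = [tokenize(doc) for doc in documents]

    # Document frequency: in how many documents each word appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    target = tokenized[index]
    if not target:
        return []

    counts = Counter(target)
    n_docs = len(documents)
    scores = {}
    for word, count in counts.items():
        tf = count / float(len(target))                # term frequency
        idf = math.log(float(n_docs) / doc_freq[word])  # inverse document frequency
        scores[word] = tf * idf

    # Highest score first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_n]


# Hypothetical sample texts standing in for the text of downloaded pages.
pages = [
    "The car dealer sells a used car and a new car.",
    "The dealer is open on the weekend.",
    "A new bakery is open in the town.",
]

for word, score in tfidf_keywords(pages, 0):
    print(word, round(score, 3))
```

With these sample pages, "car" ranks first for the first page because it is frequent there and absent from the other two. A word that appears in every page gets an idf of zero, which is why the score depends on the set of pages you compare against.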

I don't really see how this is web2py related so maybe I'm misunderstanding something.