Downloading/exporting a CSV file when a button is clicked in web.py (Python)

shiva krishna

Feb 22, 2013, 5:03:16 AM

to we...@googlegroups.com

I am using the Python `web.py` framework to build a small web app.

The app:

1. Shows a `Home page` that takes a URL as input

2. Reads the `anchor text` and `anchor tags` from that URL

3. Writes them to a CSV file and downloads it

Steps 2 and 3 happen when the `export the links` button is clicked. Below is my code.

**code.py**

    import web
    from web import form
    import urlparse
    from urlparse import urlparse as ue
    import urllib2
    from BeautifulSoup import BeautifulSoup
    import csv
    from cStringIO import StringIO

    urls = (
        '/', 'index',
        '/export', 'export',
    )

    app = web.application(urls, globals())
    render = web.template.render('templates/')

    class index:
        def GET(self):
            return render.home()

    class export:
        def GET(self):
            i = web.input()
            if i.has_key('url') and i['url'] != '':
                url = i['url']
                page = urllib2.urlopen(url)
                html = page.read()
                page.close()
                decoded = ue(url).hostname
                if decoded.startswith('www.'):
                    decoded = ".".join(decoded.split('.')[1:])
                file_name = str(decoded.split('.')[0])
                csv_file = StringIO()
                csv_writer = csv.writer(csv_file)
                csv_writer.writerow(['Name', 'Link'])
                soup = BeautifulSoup(html)
                for anchor_tag in soup.findAll('a', href=True):
                    csv_writer.writerow([anchor_tag.text, anchor_tag['href']])
                web.header('Content-Type', 'text/csv')
                web.header('Content-disposition', 'attachment; filename=%s.csv' % file_name)
                return csv_file.getvalue()

    if __name__ == "__main__":
        app.run()

**home.html**:

    $def with()
    <html>
      <head>
        <title>Home Page</title>
      </head>
      <body>
        <form method="GET" action='/export'>
          <input type="text" name="url" maxlength="500" />
          <input class="button" type="submit" name="export the links" value="export the links" />
        </form>
      </body>
    </html>

The HTML above displays a form with a text box that takes a URL and an `export the links` button that `downloads/exports` a CSV file containing the anchor links and text.

1. For example, when we submit `http://www.google.co.in` and click `export the links`, all the anchor URLs and anchor text are saved to a CSV file, which downloads successfully.

2. But when we then immediately submit another URL, such as `http://stackoveflow.com`, and click `export the links`, the CSV file (named after the domain of the URL, as shown in the code above) downloads with that page's links, but it also contains the data (anchor text and links) from the previous URL, `http://www.google.co.in`.

That is, data from different URLs ends up in the same CSV file. Can anyone tell me what is wrong in the code above (the `export` class) that generates the CSV file, and why the data carries over instead of going into a new CSV file with a different, dynamically created name?

My intention is to download/export a new CSV file each time a new URL is given, named after the URL's domain (sliced as in my code above) and containing only the data (anchor text and URL) from that URL.

Can anyone please extend or change my code above so that it downloads an individual CSV file for each individual URL?
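For reference, the domain-based file name described in the question can be isolated in a small helper and checked on its own. This is only a sketch, written for Python 3 (`urllib.parse`) rather than the Python 2 `urlparse` module used in the thread; the function name `csv_name_for` and the fallback name `links` are hypothetical and do not come from the original code.

```python
from urllib.parse import urlparse


def csv_name_for(url):
    """Return '<first domain label>.csv' for a URL, ignoring a leading 'www.'."""
    host = urlparse(url).hostname or ""   # hostname is None when the URL has no scheme
    if host.startswith("www."):
        host = host[len("www."):]
    # Fallback name is an assumption, used when no host could be parsed.
    label = host.split(".")[0] or "links"
    return label + ".csv"


print(csv_name_for("http://www.google.co.in"))
print(csv_name_for("http://stackoveflow.com"))
```

Because the name is computed from the URL alone, two different URLs from the question produce two different file names, which makes it easier to tell whether a shared name or shared data is the real issue.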

Scott Gelin

Feb 22, 2013, 8:40:11 PM

to we...@googlegroups.com

I think it may have to do with the fact that you never close the stream you're writing to. Have you tried printing the output of csv_file before writing to it? It would help you ensure it was empty before throwing more lines into it. I added in two comments near the top of the code snippet - feel free to uncomment the second one for the relevant information.

Anyway, I modified the bottom section of your export class. I've never worked particularly with either csv or cStringIO, so no guarantees this fixes your problem, but it seems like leaving the stream open could be causing issues.

    csv_file = StringIO()
    #debugging info
    #print csv_file.getvalue()
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['Name', 'Link'])
    soup = BeautifulSoup(html)
    for anchor_tag in soup.findAll('a', href=True):
        csv_writer.writerow([anchor_tag.text, anchor_tag['href']])
    web.header('Content-Type', 'text/csv')
    web.header('Content-disposition', 'attachment; filename=%s.csv' % file_name)
    returnval = csv_file.getvalue()
    csv_file.close()
    return returnval
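The pattern in the snippet above (create the buffer, write, read the value, then close) can be tried outside web.py to check whether one export leaks into the next. The following is a minimal sketch under some assumptions: it uses Python 3's `io.StringIO` in place of `cStringIO`, the helper name `rows_to_csv` is hypothetical, and the rows are hard-coded instead of scraped. It does not by itself establish the cause of the behaviour reported in the thread.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialise (name, link) pairs to CSV text using a buffer created for this call only."""
    buf = io.StringIO(newline="")      # fresh buffer per call; nothing is shared between calls
    writer = csv.writer(buf)
    writer.writerow(["Name", "Link"])
    writer.writerows(rows)
    text = buf.getvalue()              # read the contents before closing the buffer
    buf.close()
    return text


first = rows_to_csv([("Google", "http://www.google.co.in")])
second = rows_to_csv([("Stack Overflow", "http://stackoveflow.com")])

# The second export should contain only its own rows.
print(second)
```

If the two results stay separate here but the downloaded files still overlap, the buffer is probably not the source of the duplication, and it may be worth looking at what the browser or server does with the response instead.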
