splitting uploaded CSV data


Vasu Nagalingam

Aug 5, 2011, 3:43:46 PM
to django...@googlegroups.com
Hi - I am implementing a batch file import function where the data file is in CSV format. The data file is uploaded (InMemoryFileHandler) successfully, but I am having trouble parsing the CSV data correctly.

My code: 
for line in imported_data: 
   line = line.split(',') 

My data is organized in this format: <id>,<name>,<address1>,<city>,<province>,<postal> 
0001,"google","123 Main St","san francisco","CA",94000
0002,"microsoft","1 Main St, 9th Floor","Boston","MA",17000

The code works fine for the first row (google). Because of the extra comma inside the quoted address1 field for Microsoft, that row is split into seven fields instead of six and the import is mucked up. Is there a way to tell the split function about the text delimiter (the quote character) as well as the data delimiter? Or are there any other options?



Subhranath Chunder

Aug 5, 2011, 4:01:14 PM
to django...@googlegroups.com
Using string split on CSV like this is bad practice. Prefer a CSV manipulation library instead, like http://docs.python.org/library/csv.html

You can use the CSV reader module for your purpose.
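
For future readers, here is a minimal sketch of the difference, using the two sample rows from the question. It assumes Python 3; the helper name parse_rows is made up for illustration, and the data is fed from a string rather than from an uploaded file.

```python
import csv
import io

# Sample rows copied from the question.
SAMPLE = (
    '0001,"google","123 Main St","san francisco","CA",94000\n'
    '0002,"microsoft","1 Main St, 9th Floor","Boston","MA",17000\n'
)


def parse_rows(text):
    """Parse CSV text into a list of rows, honoring quoted fields."""
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline="")))


# csv.reader keeps the quoted address together as one field.
rows = parse_rows(SAMPLE)

# A plain split breaks the quoted address at its embedded comma.
naive = SAMPLE.splitlines()[1].split(",")
```

Here rows[1] has six fields, while naive has seven and still carries the quote characters.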





Vasu Nagalingam

Aug 5, 2011, 4:17:24 PM
to Django users
Thanks. I am on it. Info below for future readers.

What I did was:
1) Set the order of the upload file handlers in settings.py so that
Django's default behavior is always to save the file to disk
instead of keeping it in memory (as it does for files under 2.5 MB)
FILE_UPLOAD_HANDLERS = (
'django.core.files.uploadhandler.TemporaryFileUploadHandler',
'django.core.files.uploadhandler.MemoryFileUploadHandler',
)

2) In my post-upload routine, I opened the file using Python's CSV
manipulation library.
import csv

fname = uploaded_data_file.temporary_file_path()
# 'rb' is the mode the csv module expects on Python 2
f = csv.reader(open(fname, 'rb'), delimiter=',', quotechar='"')
for line in f:
    co_id = line[0]
    co_name = line[1]
    co_addr1 = line[2]
    # ... and so forth for the remaining columns
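
A possible variant for Python 3, sketched under some assumptions: the upload is available as a binary file-like object, it is UTF-8 encoded, and only the first three columns are needed. The function name read_company_rows is hypothetical, and io.BytesIO stands in for the uploaded file so the sketch runs on its own. On Python 3 the csv module wants text rather than bytes, so the binary stream is wrapped in io.TextIOWrapper.

```python
import csv
import io


def read_company_rows(binary_file, encoding="utf-8"):
    """Yield (co_id, co_name, co_addr1) tuples from a binary file-like object."""
    # Decode bytes to text; newline="" leaves line-ending handling to csv.
    text = io.TextIOWrapper(binary_file, encoding=encoding, newline="")
    for line in csv.reader(text):
        if not line:
            continue  # skip blank lines
        yield line[0], line[1], line[2]


# Stand-in for an uploaded file (assumption: UTF-8 bytes).
upload = io.BytesIO(
    b'0002,"microsoft","1 Main St, 9th Floor","Boston","MA",17000\r\n'
)
companies = list(read_company_rows(upload))
```

Because this reads from any binary stream, it should not depend on the file having been saved to disk first.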



On Aug 5, 4:01 pm, Subhranath Chunder <subhran...@gmail.com> wrote:
> Using the string split on CSV like this is bad in practice. Prefer using CSV
> manipulation library instead. Like, http://docs.python.org/library/csv.html
>
> You can use the CSV reader module for your purpose.
>

Subhranath Chunder

Aug 5, 2011, 4:32:37 PM
to django...@googlegroups.com
Instead of writing:
f = csv.reader(open(fname, 'rb'), delimiter=',', quotechar='"')

you could have written only:
f = csv.reader(open(fname, 'rb'))

The 'delimiter' and 'quotechar' arguments default to the values you've specified.
But there's nothing wrong with being explicit rather than relying on the implicit values.
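
A quick way to check this for yourself, as a small sketch: csv.reader uses the 'excel' dialect by default, so its delimiter and quotechar can be inspected directly. The one-line input below is made up for illustration.

```python
import csv

# The default dialect used by csv.reader when none is given.
default_dialect = csv.excel

# csv.reader accepts any iterable of lines, so a list of strings works.
sample = ['a,"b,c",d']
explicit = list(csv.reader(sample, delimiter=",", quotechar='"'))
implicit = list(csv.reader(sample))
```

Both calls parse the sample line into the same three fields.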