CSV - important

2 views
Skip to first unread message

mdipierro

unread,
Dec 22, 2008, 5:38:28 AM12/22/08
to web2py Web Framework
I have rewritten the CSV io functions (in trunk). Now they work
better, are more portable and should work on GAE.
They also have one cool new feature:

Consider this:

db=SQLDB('sqlite:memory:')
db.define_table('t1',SQLField('name'),SQLField('test'))
db.define_table('t2',SQLField('a',db.t1),SQLField('name'))
db.t1.insert(name='max')
db.t2.insert(a=1,name='tim')

The t2 record references the t1 record. Let's dump both tables:

open('t1.csv','w').write(db(db.t1.id>0).select())
open('t2.csv','w').write(db(db.t2.id>0).select())

They contain:

t1.id,t1.name
1,max
t2.id,t2.a,t2.name
1,1,tim

Now reimport, in the naive way,

db.t1.import_from_csv_file(open('t1.csv','r'))
db.t2.import_from_csv_file(open('t2.csv','r'))

and you get

t1.id,t1.name
1,max
2,max
t2.id,t2.a,t2.name
1,1,tim
2,1,tim

Nothing new here. The record t2.id==2 references t1.id==1 instead of
t1.id==2. This is not what one would expect since the new t2 record
should reference the new t1 record.

To fix this you can instead import in the following way:

remap={}
db.t1.import_from_csv_file(open('t1.csv','r'),remap)
db.t2.import_from_csv_file(open('t2.csv','r'),remap)

and you get

t1.id,t1.name
1,max
2,max
t2.id,t2.a,t2.name
1,1,tim
2,2,tim

Now t2.id==2 reference t1.id==2 instead of t1.id==1. This is one would
expect.

Summary: if you need to import multiple tables that reference each
other, pass a dictionary (the same) to all calls to
import_from_csv_file and web2py will take care of remapping the
reference indices. The dictionary is used to store the mapping between
the old 'id' and the new 'id'.

This can be used for remote database synchronization using CSV.

None is serialized as <NULL> to avoid csv shortcomings.

Massimo

mmstud

unread,
Dec 22, 2008, 6:48:54 AM12/22/08
to web2py Web Framework
I could think a lot of different ways to mark null "value": n/a, NA,
n.a., None, NULL and so on. If csv file is used on another
applications than on import data to web2py itself, then there should
be a way to manually decide, how null should be marked and which data
should be interpreted as null. So i would set an optional argument,
which default is '<NULL>', but is changeable.

-Marko

mdipierro

unread,
Dec 22, 2008, 8:02:25 AM12/22/08
to web2py Web Framework
New stuff in trunk:

1) customizable None/Null serialization

db(db.table.id>0).select().export_to_csv_file(open
('filename','w'),null='<NULL>')
db.table,import_from_csv_file(open('filename','r'),null='<NULL>')

2) complete database dump

db.export_to_csv_file(open('filename','w'))
for table in db.tables: db[table].truncate()
db.import_from_csv_file(open('filename','r'))

tables must be defined to be re imported.

Massimo
Reply all
Reply to author
Forward
0 new messages